<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/arch/arm64/kernel/entry.S, branch v4.15</title>
<subtitle>Linux kernel source tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/'/>
<entry>
<title>arm64/sve: Detect SVE and activate runtime support</title>
<updated>2017-11-03T15:24:21+00:00</updated>
<author>
<name>Dave Martin</name>
<email>Dave.Martin@arm.com</email>
</author>
<published>2017-10-31T15:51:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=43994d824e8443263dc98b151e6326bf677be52e'/>
<id>43994d824e8443263dc98b151e6326bf677be52e</id>
<content type='text'>
This patch enables detection of hardware SVE support via the
cpufeatures framework, and reports its presence to the kernel and
userspace via the new ARM64_SVE cpucap and HWCAP_SVE hwcap
respectively.

Userspace can also detect SVE using ID_AA64PFR0_EL1, using the
cpufeatures MRS emulation.

When running on hardware that supports SVE, this enables runtime
kernel support for SVE, and allows user tasks to execute SVE
instructions and make of the of the SVE-specific user/kernel
interface extensions implemented by this series.

Signed-off-by: Dave Martin &lt;Dave.Martin@arm.com&gt;
Reviewed-by: Suzuki K Poulose &lt;suzuki.poulose@arm.com&gt;
Reviewed-by: Catalin Marinas &lt;catalin.marinas@arm.com&gt;
Signed-off-by: Will Deacon &lt;will.deacon@arm.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This patch enables detection of hardware SVE support via the
cpufeatures framework, and reports its presence to the kernel and
userspace via the new ARM64_SVE cpucap and HWCAP_SVE hwcap
respectively.

Userspace can also detect SVE using ID_AA64PFR0_EL1, using the
cpufeatures MRS emulation.

When running on hardware that supports SVE, this enables runtime
kernel support for SVE, and allows user tasks to execute SVE
instructions and make of the of the SVE-specific user/kernel
interface extensions implemented by this series.

Signed-off-by: Dave Martin &lt;Dave.Martin@arm.com&gt;
Reviewed-by: Suzuki K Poulose &lt;suzuki.poulose@arm.com&gt;
Reviewed-by: Catalin Marinas &lt;catalin.marinas@arm.com&gt;
Signed-off-by: Will Deacon &lt;will.deacon@arm.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>arm64/sve: Core task context handling</title>
<updated>2017-11-03T15:24:15+00:00</updated>
<author>
<name>Dave Martin</name>
<email>Dave.Martin@arm.com</email>
</author>
<published>2017-10-31T15:51:05+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=bc0ee476036478a85beeed51f0d94c8729fd0544'/>
<id>bc0ee476036478a85beeed51f0d94c8729fd0544</id>
<content type='text'>
This patch adds the core support for switching and managing the SVE
architectural state of user tasks.

Calls to the existing FPSIMD low-level save/restore functions are
factored out as new functions task_fpsimd_{save,load}(), since SVE
now dynamically may or may not need to be handled at these points
depending on the kernel configuration, hardware features discovered
at boot, and the runtime state of the task.  To make these
decisions as fast as possible, const cpucaps are used where
feasible, via the system_supports_sve() helper.

The SVE registers are only tracked for threads that have explicitly
used SVE, indicated by the new thread flag TIF_SVE.  Otherwise, the
FPSIMD view of the architectural state is stored in
thread.fpsimd_state as usual.

When in use, the SVE registers are not stored directly in
thread_struct due to their potentially large and variable size.
Because the task_struct slab allocator must be configured very
early during kernel boot, it is also tricky to configure it
correctly to match the maximum vector length provided by the
hardware, since this depends on examining secondary CPUs as well as
the primary.  Instead, a pointer sve_state in thread_struct points
to a dynamically allocated buffer containing the SVE register data,
and code is added to allocate and free this buffer at appropriate
times.

TIF_SVE is set when taking an SVE access trap from userspace, if
suitable hardware support has been detected.  This enables SVE for
the thread: a subsequent return to userspace will disable the trap
accordingly.  If such a trap is taken without sufficient system-
wide hardware support, SIGILL is sent to the thread instead as if
an undefined instruction had been executed: this may happen if
userspace tries to use SVE in a system where not all CPUs support
it for example.

The kernel will clear TIF_SVE and disable SVE for the thread
whenever an explicit syscall is made by userspace.  For backwards
compatibility reasons and conformance with the spirit of the base
AArch64 procedure call standard, the subset of the SVE register
state that aliases the FPSIMD registers is still preserved across a
syscall even if this happens.  The remainder of the SVE register
state logically becomes zero at syscall entry, though the actual
zeroing work is currently deferred until the thread next tries to
use SVE, causing another trap to the kernel.  This implementation
is suboptimal: in the future, the fastpath case may be optimised
to zero the registers in-place and leave SVE enabled for the task,
where beneficial.

TIF_SVE is also cleared in the following slowpath cases, which are
taken as reasonable hints that the task may no longer use SVE:
 * exec
 * fork and clone

Code is added to sync data between thread.fpsimd_state and
thread.sve_state whenever enabling/disabling SVE, in a manner
consistent with the SVE architectural programmer's model.

Signed-off-by: Dave Martin &lt;Dave.Martin@arm.com&gt;
Reviewed-by: Catalin Marinas &lt;catalin.marinas@arm.com&gt;
Cc: Ard Biesheuvel &lt;ard.biesheuvel@linaro.org&gt;
Cc: Alex Bennée &lt;alex.bennee@linaro.org&gt;
[will: added #include to fix allnoconfig build]
[will: use enable_daif in do_sve_acc]
Signed-off-by: Will Deacon &lt;will.deacon@arm.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This patch adds the core support for switching and managing the SVE
architectural state of user tasks.

Calls to the existing FPSIMD low-level save/restore functions are
factored out as new functions task_fpsimd_{save,load}(), since SVE
now dynamically may or may not need to be handled at these points
depending on the kernel configuration, hardware features discovered
at boot, and the runtime state of the task.  To make these
decisions as fast as possible, const cpucaps are used where
feasible, via the system_supports_sve() helper.

The SVE registers are only tracked for threads that have explicitly
used SVE, indicated by the new thread flag TIF_SVE.  Otherwise, the
FPSIMD view of the architectural state is stored in
thread.fpsimd_state as usual.

When in use, the SVE registers are not stored directly in
thread_struct due to their potentially large and variable size.
Because the task_struct slab allocator must be configured very
early during kernel boot, it is also tricky to configure it
correctly to match the maximum vector length provided by the
hardware, since this depends on examining secondary CPUs as well as
the primary.  Instead, a pointer sve_state in thread_struct points
to a dynamically allocated buffer containing the SVE register data,
and code is added to allocate and free this buffer at appropriate
times.

TIF_SVE is set when taking an SVE access trap from userspace, if
suitable hardware support has been detected.  This enables SVE for
the thread: a subsequent return to userspace will disable the trap
accordingly.  If such a trap is taken without sufficient system-
wide hardware support, SIGILL is sent to the thread instead as if
an undefined instruction had been executed: this may happen if
userspace tries to use SVE in a system where not all CPUs support
it for example.

The kernel will clear TIF_SVE and disable SVE for the thread
whenever an explicit syscall is made by userspace.  For backwards
compatibility reasons and conformance with the spirit of the base
AArch64 procedure call standard, the subset of the SVE register
state that aliases the FPSIMD registers is still preserved across a
syscall even if this happens.  The remainder of the SVE register
state logically becomes zero at syscall entry, though the actual
zeroing work is currently deferred until the thread next tries to
use SVE, causing another trap to the kernel.  This implementation
is suboptimal: in the future, the fastpath case may be optimised
to zero the registers in-place and leave SVE enabled for the task,
where beneficial.

TIF_SVE is also cleared in the following slowpath cases, which are
taken as reasonable hints that the task may no longer use SVE:
 * exec
 * fork and clone

Code is added to sync data between thread.fpsimd_state and
thread.sve_state whenever enabling/disabling SVE, in a manner
consistent with the SVE architectural programmer's model.

Signed-off-by: Dave Martin &lt;Dave.Martin@arm.com&gt;
Reviewed-by: Catalin Marinas &lt;catalin.marinas@arm.com&gt;
Cc: Ard Biesheuvel &lt;ard.biesheuvel@linaro.org&gt;
Cc: Alex Bennée &lt;alex.bennee@linaro.org&gt;
[will: added #include to fix allnoconfig build]
[will: use enable_daif in do_sve_acc]
Signed-off-by: Will Deacon &lt;will.deacon@arm.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>arm64: entry.S: move SError handling into a C function for future expansion</title>
<updated>2017-11-02T15:55:41+00:00</updated>
<author>
<name>Xie XiuQi</name>
<email>xiexiuqi@huawei.com</email>
</author>
<published>2017-11-02T12:12:42+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=a92d4d1454ab8b43b80b89fa31fcedb8821f8164'/>
<id>a92d4d1454ab8b43b80b89fa31fcedb8821f8164</id>
<content type='text'>
Today SError is taken using the inv_entry macro that ends up in
bad_mode.

SError can be used by the RAS Extensions to notify either the OS or
firmware of CPU problems, some of which may have been corrected.

To allow this handling to be added, add a do_serror() C function
that just panic()s. Add the entry.S boiler plate to save/restore the
CPU registers and unmask debug exceptions. Future patches may change
do_serror() to return if the SError Interrupt was notification of a
corrected error.

Signed-off-by: Xie XiuQi &lt;xiexiuqi@huawei.com&gt;
Signed-off-by: Wang Xiongfeng &lt;wangxiongfengi2@huawei.com&gt;
[Split out of a bigger patch, added compat path, renamed, enabled debug
 exceptions]
Signed-off-by: James Morse &lt;james.morse@arm.com&gt;
Reviewed-by: Catalin Marinas &lt;catalin.marinas@arm.com&gt;
Signed-off-by: Will Deacon &lt;will.deacon@arm.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Today SError is taken using the inv_entry macro that ends up in
bad_mode.

SError can be used by the RAS Extensions to notify either the OS or
firmware of CPU problems, some of which may have been corrected.

To allow this handling to be added, add a do_serror() C function
that just panic()s. Add the entry.S boiler plate to save/restore the
CPU registers and unmask debug exceptions. Future patches may change
do_serror() to return if the SError Interrupt was notification of a
corrected error.

Signed-off-by: Xie XiuQi &lt;xiexiuqi@huawei.com&gt;
Signed-off-by: Wang Xiongfeng &lt;wangxiongfengi2@huawei.com&gt;
[Split out of a bigger patch, added compat path, renamed, enabled debug
 exceptions]
Signed-off-by: James Morse &lt;james.morse@arm.com&gt;
Reviewed-by: Catalin Marinas &lt;catalin.marinas@arm.com&gt;
Signed-off-by: Will Deacon &lt;will.deacon@arm.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>arm64: entry.S: convert elX_irq</title>
<updated>2017-11-02T15:55:41+00:00</updated>
<author>
<name>James Morse</name>
<email>james.morse@arm.com</email>
</author>
<published>2017-11-02T12:12:41+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=b282e1ce29bb677224ba8fb38e94f5e94e2656d5'/>
<id>b282e1ce29bb677224ba8fb38e94f5e94e2656d5</id>
<content type='text'>
Following our 'dai' order, irqs should be processed with debug and
serror exceptions unmasked.

Add a helper to unmask these two, (and fiq for good measure).

Signed-off-by: James Morse &lt;james.morse@arm.com&gt;
Reviewed-by: Julien Thierry &lt;julien.thierry@arm.com&gt;
Reviewed-by: Catalin Marinas &lt;catalin.marinas@arm.com&gt;
Signed-off-by: Will Deacon &lt;will.deacon@arm.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Following our 'dai' order, irqs should be processed with debug and
serror exceptions unmasked.

Add a helper to unmask these two, (and fiq for good measure).

Signed-off-by: James Morse &lt;james.morse@arm.com&gt;
Reviewed-by: Julien Thierry &lt;julien.thierry@arm.com&gt;
Reviewed-by: Catalin Marinas &lt;catalin.marinas@arm.com&gt;
Signed-off-by: Will Deacon &lt;will.deacon@arm.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>arm64: entry.S convert el0_sync</title>
<updated>2017-11-02T15:55:41+00:00</updated>
<author>
<name>James Morse</name>
<email>james.morse@arm.com</email>
</author>
<published>2017-11-02T12:12:40+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=746647c75afb5a1706426c2563ff02884a15530d'/>
<id>746647c75afb5a1706426c2563ff02884a15530d</id>
<content type='text'>
el0_sync also unmasks exceptions on a case-by-case basis, debug exceptions
are enabled, unless this was a debug exception. Irqs are unmasked for
some exception types but not for others.

el0_dbg should run with everything masked to prevent us taking a debug
exception from do_debug_exception. For the other cases we can unmask
everything. This changes the behaviour of fpsimd_{acc,exc} and el0_inv
which previously ran with irqs masked.

This patch removed the last user of enable_dbg_and_irq, remove it.

Signed-off-by: James Morse &lt;james.morse@arm.com&gt;
Reviewed-by: Julien Thierry &lt;julien.thierry@arm.com&gt;
Reviewed-by: Catalin Marinas &lt;catalin.marinas@arm.com&gt;
Signed-off-by: Will Deacon &lt;will.deacon@arm.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
el0_sync also unmasks exceptions on a case-by-case basis, debug exceptions
are enabled, unless this was a debug exception. Irqs are unmasked for
some exception types but not for others.

el0_dbg should run with everything masked to prevent us taking a debug
exception from do_debug_exception. For the other cases we can unmask
everything. This changes the behaviour of fpsimd_{acc,exc} and el0_inv
which previously ran with irqs masked.

This patch removed the last user of enable_dbg_and_irq, remove it.

Signed-off-by: James Morse &lt;james.morse@arm.com&gt;
Reviewed-by: Julien Thierry &lt;julien.thierry@arm.com&gt;
Reviewed-by: Catalin Marinas &lt;catalin.marinas@arm.com&gt;
Signed-off-by: Will Deacon &lt;will.deacon@arm.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>arm64: entry.S: convert el1_sync</title>
<updated>2017-11-02T15:55:41+00:00</updated>
<author>
<name>James Morse</name>
<email>james.morse@arm.com</email>
</author>
<published>2017-11-02T12:12:39+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=b55a5a1b0a7d4f51b6c8eec0d4d78ace8f5fa2b3'/>
<id>b55a5a1b0a7d4f51b6c8eec0d4d78ace8f5fa2b3</id>
<content type='text'>
el1_sync unmasks exceptions on a case-by-case basis, debug exceptions
are unmasked, unless this was a debug exception. IRQs are unmasked
for instruction and data aborts only if the interupted context had
irqs unmasked.

Following our 'dai' order, el1_dbg should run with everything masked.
For the other cases we can inherit whatever we interrupted.

Add a macro inherit_daif to set daif based on the interrupted pstate.

Signed-off-by: James Morse &lt;james.morse@arm.com&gt;
Reviewed-by: Julien Thierry &lt;julien.thierry@arm.com&gt;
Reviewed-by: Catalin Marinas &lt;catalin.marinas@arm.com&gt;
Signed-off-by: Will Deacon &lt;will.deacon@arm.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
el1_sync unmasks exceptions on a case-by-case basis, debug exceptions
are unmasked, unless this was a debug exception. IRQs are unmasked
for instruction and data aborts only if the interupted context had
irqs unmasked.

Following our 'dai' order, el1_dbg should run with everything masked.
For the other cases we can inherit whatever we interrupted.

Add a macro inherit_daif to set daif based on the interrupted pstate.

Signed-off-by: James Morse &lt;james.morse@arm.com&gt;
Reviewed-by: Julien Thierry &lt;julien.thierry@arm.com&gt;
Reviewed-by: Catalin Marinas &lt;catalin.marinas@arm.com&gt;
Signed-off-by: Will Deacon &lt;will.deacon@arm.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>arm64: Mask all exceptions during kernel_exit</title>
<updated>2017-11-02T15:55:41+00:00</updated>
<author>
<name>James Morse</name>
<email>james.morse@arm.com</email>
</author>
<published>2017-11-02T12:12:37+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=8d66772e869e79ffb94eed7492ca3d2267e150e8'/>
<id>8d66772e869e79ffb94eed7492ca3d2267e150e8</id>
<content type='text'>
To take RAS Exceptions as quickly as possible we need to keep SError
unmasked as much as possible. We need to mask it during kernel_exit
as taking an error from this code will overwrite the exception-registers.

Adding a naked 'disable_daif' to kernel_exit causes a performance problem
for micro-benchmarks that do no real work, (e.g. calling getpid() in a
loop). This is because the ret_to_user loop has already masked IRQs so
that the TIF_WORK_MASK thread flags can't change underneath it, adding
disable_daif is an additional self-synchronising operation.

In the future, the RAS APEI code may need to modify the TIF_WORK_MASK
flags from an SError, in which case the ret_to_user loop must mask SError
while it examines the flags.

Disable all exceptions for return to EL1. For return to EL0 get the
ret_to_user loop to leave all exceptions masked once it has done its
work, this avoids an extra pstate-write.

Signed-off-by: James Morse &lt;james.morse@arm.com&gt;
Reviewed-by: Julien Thierry &lt;julien.thierry@arm.com&gt;
Reviewed-by: Catalin Marinas &lt;catalin.marinas@arm.com&gt;
Signed-off-by: Will Deacon &lt;will.deacon@arm.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
To take RAS Exceptions as quickly as possible we need to keep SError
unmasked as much as possible. We need to mask it during kernel_exit
as taking an error from this code will overwrite the exception-registers.

Adding a naked 'disable_daif' to kernel_exit causes a performance problem
for micro-benchmarks that do no real work, (e.g. calling getpid() in a
loop). This is because the ret_to_user loop has already masked IRQs so
that the TIF_WORK_MASK thread flags can't change underneath it, adding
disable_daif is an additional self-synchronising operation.

In the future, the RAS APEI code may need to modify the TIF_WORK_MASK
flags from an SError, in which case the ret_to_user loop must mask SError
while it examines the flags.

Disable all exceptions for return to EL1. For return to EL0 get the
ret_to_user loop to leave all exceptions masked once it has done its
work, this avoids an extra pstate-write.

Signed-off-by: James Morse &lt;james.morse@arm.com&gt;
Reviewed-by: Julien Thierry &lt;julien.thierry@arm.com&gt;
Reviewed-by: Catalin Marinas &lt;catalin.marinas@arm.com&gt;
Signed-off-by: Will Deacon &lt;will.deacon@arm.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>arm64: move TASK_* definitions to &lt;asm/processor.h&gt;</title>
<updated>2017-10-02T09:13:04+00:00</updated>
<author>
<name>Yury Norov</name>
<email>ynorov@caviumnetworks.com</email>
</author>
<published>2017-08-31T08:30:50+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=eef94a3d09aab437c8c254de942d8b1aa76455e2'/>
<id>eef94a3d09aab437c8c254de942d8b1aa76455e2</id>
<content type='text'>
ILP32 series [1] introduces the dependency on &lt;asm/is_compat.h&gt; for
TASK_SIZE macro. Which in turn requires &lt;asm/thread_info.h&gt;, and
&lt;asm/thread_info.h&gt; include &lt;asm/memory.h&gt;, giving a circular dependency,
because TASK_SIZE is currently located in &lt;asm/memory.h&gt;.

In other architectures, TASK_SIZE is defined in &lt;asm/processor.h&gt;, and
moving TASK_SIZE there fixes the problem.

Discussion: https://patchwork.kernel.org/patch/9929107/

[1] https://github.com/norov/linux/tree/ilp32-next

CC: Will Deacon &lt;will.deacon@arm.com&gt;
CC: Laura Abbott &lt;labbott@redhat.com&gt;
Cc: Ard Biesheuvel &lt;ard.biesheuvel@linaro.org&gt;
Cc: Catalin Marinas &lt;catalin.marinas@arm.com&gt;
Cc: James Morse &lt;james.morse@arm.com&gt;
Suggested-by: Mark Rutland &lt;mark.rutland@arm.com&gt;
Signed-off-by: Yury Norov &lt;ynorov@caviumnetworks.com&gt;
Signed-off-by: Will Deacon &lt;will.deacon@arm.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
ILP32 series [1] introduces the dependency on &lt;asm/is_compat.h&gt; for
TASK_SIZE macro. Which in turn requires &lt;asm/thread_info.h&gt;, and
&lt;asm/thread_info.h&gt; include &lt;asm/memory.h&gt;, giving a circular dependency,
because TASK_SIZE is currently located in &lt;asm/memory.h&gt;.

In other architectures, TASK_SIZE is defined in &lt;asm/processor.h&gt;, and
moving TASK_SIZE there fixes the problem.

Discussion: https://patchwork.kernel.org/patch/9929107/

[1] https://github.com/norov/linux/tree/ilp32-next

CC: Will Deacon &lt;will.deacon@arm.com&gt;
CC: Laura Abbott &lt;labbott@redhat.com&gt;
Cc: Ard Biesheuvel &lt;ard.biesheuvel@linaro.org&gt;
Cc: Catalin Marinas &lt;catalin.marinas@arm.com&gt;
Cc: James Morse &lt;james.morse@arm.com&gt;
Suggested-by: Mark Rutland &lt;mark.rutland@arm.com&gt;
Signed-off-by: Yury Norov &lt;ynorov@caviumnetworks.com&gt;
Signed-off-by: Will Deacon &lt;will.deacon@arm.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge branch 'arm64/vmap-stack' of git://git.kernel.org/pub/scm/linux/kernel/git/mark/linux into for-next/core</title>
<updated>2017-08-15T17:40:58+00:00</updated>
<author>
<name>Catalin Marinas</name>
<email>catalin.marinas@arm.com</email>
</author>
<published>2017-08-15T17:40:58+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=df5b95bee1ed7009a2090e9924e7a96e14850d56'/>
<id>df5b95bee1ed7009a2090e9924e7a96e14850d56</id>
<content type='text'>
* 'arm64/vmap-stack' of git://git.kernel.org/pub/scm/linux/kernel/git/mark/linux:
  arm64: add VMAP_STACK overflow detection
  arm64: add on_accessible_stack()
  arm64: add basic VMAP_STACK support
  arm64: use an irq stack pointer
  arm64: assembler: allow adr_this_cpu to use the stack pointer
  arm64: factor out entry stack manipulation
  efi/arm64: add EFI_KIMG_ALIGN
  arm64: move SEGMENT_ALIGN to &lt;asm/memory.h&gt;
  arm64: clean up irq stack definitions
  arm64: clean up THREAD_* definitions
  arm64: factor out PAGE_* and CONT_* definitions
  arm64: kernel: remove {THREAD,IRQ_STACK}_START_SP
  fork: allow arch-override of VMAP stack alignment
  arm64: remove __die()'s stack dump
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
* 'arm64/vmap-stack' of git://git.kernel.org/pub/scm/linux/kernel/git/mark/linux:
  arm64: add VMAP_STACK overflow detection
  arm64: add on_accessible_stack()
  arm64: add basic VMAP_STACK support
  arm64: use an irq stack pointer
  arm64: assembler: allow adr_this_cpu to use the stack pointer
  arm64: factor out entry stack manipulation
  efi/arm64: add EFI_KIMG_ALIGN
  arm64: move SEGMENT_ALIGN to &lt;asm/memory.h&gt;
  arm64: clean up irq stack definitions
  arm64: clean up THREAD_* definitions
  arm64: factor out PAGE_* and CONT_* definitions
  arm64: kernel: remove {THREAD,IRQ_STACK}_START_SP
  fork: allow arch-override of VMAP stack alignment
  arm64: remove __die()'s stack dump
</pre>
</div>
</content>
</entry>
<entry>
<title>arm64: add VMAP_STACK overflow detection</title>
<updated>2017-08-15T17:36:18+00:00</updated>
<author>
<name>Mark Rutland</name>
<email>mark.rutland@arm.com</email>
</author>
<published>2017-07-14T19:30:35+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=872d8327ce8982883b8237b2c320c8666f14e561'/>
<id>872d8327ce8982883b8237b2c320c8666f14e561</id>
<content type='text'>
This patch adds stack overflow detection to arm64, usable when vmap'd stacks
are in use.

Overflow is detected in a small preamble executed for each exception entry,
which checks whether there is enough space on the current stack for the general
purpose registers to be saved. If there is not enough space, the overflow
handler is invoked on a per-cpu overflow stack. This approach preserves the
original exception information in ESR_EL1 (and where appropriate, FAR_EL1).

Task and IRQ stacks are aligned to double their size, enabling overflow to be
detected with a single bit test. For example, a 16K stack is aligned to 32K,
ensuring that bit 14 of the SP must be zero. On an overflow (or underflow),
this bit is flipped. Thus, overflow (of less than the size of the stack) can be
detected by testing whether this bit is set.

The overflow check is performed before any attempt is made to access the
stack, avoiding recursive faults (and the loss of exception information
these would entail). As logical operations cannot be performed on the SP
directly, the SP is temporarily swapped with a general purpose register
using arithmetic operations to enable the test to be performed.

This gives us a useful error message on stack overflow, as can be trigger with
the LKDTM overflow test:

[  305.388749] lkdtm: Performing direct entry OVERFLOW
[  305.395444] Insufficient stack space to handle exception!
[  305.395482] ESR: 0x96000047 -- DABT (current EL)
[  305.399890] FAR: 0xffff00000a5e7f30
[  305.401315] Task stack:     [0xffff00000a5e8000..0xffff00000a5ec000]
[  305.403815] IRQ stack:      [0xffff000008000000..0xffff000008004000]
[  305.407035] Overflow stack: [0xffff80003efce4e0..0xffff80003efcf4e0]
[  305.409622] CPU: 0 PID: 1219 Comm: sh Not tainted 4.13.0-rc3-00021-g9636aea #5
[  305.412785] Hardware name: linux,dummy-virt (DT)
[  305.415756] task: ffff80003d051c00 task.stack: ffff00000a5e8000
[  305.419221] PC is at recursive_loop+0x10/0x48
[  305.421637] LR is at recursive_loop+0x38/0x48
[  305.423768] pc : [&lt;ffff00000859f330&gt;] lr : [&lt;ffff00000859f358&gt;] pstate: 40000145
[  305.428020] sp : ffff00000a5e7f50
[  305.430469] x29: ffff00000a5e8350 x28: ffff80003d051c00
[  305.433191] x27: ffff000008981000 x26: ffff000008f80400
[  305.439012] x25: ffff00000a5ebeb8 x24: ffff00000a5ebeb8
[  305.440369] x23: ffff000008f80138 x22: 0000000000000009
[  305.442241] x21: ffff80003ce65000 x20: ffff000008f80188
[  305.444552] x19: 0000000000000013 x18: 0000000000000006
[  305.446032] x17: 0000ffffa2601280 x16: ffff0000081fe0b8
[  305.448252] x15: ffff000008ff546d x14: 000000000047a4c8
[  305.450246] x13: ffff000008ff7872 x12: 0000000005f5e0ff
[  305.452953] x11: ffff000008ed2548 x10: 000000000005ee8d
[  305.454824] x9 : ffff000008545380 x8 : ffff00000a5e8770
[  305.457105] x7 : 1313131313131313 x6 : 00000000000000e1
[  305.459285] x5 : 0000000000000000 x4 : 0000000000000000
[  305.461781] x3 : 0000000000000000 x2 : 0000000000000400
[  305.465119] x1 : 0000000000000013 x0 : 0000000000000012
[  305.467724] Kernel panic - not syncing: kernel stack overflow
[  305.470561] CPU: 0 PID: 1219 Comm: sh Not tainted 4.13.0-rc3-00021-g9636aea #5
[  305.473325] Hardware name: linux,dummy-virt (DT)
[  305.475070] Call trace:
[  305.476116] [&lt;ffff000008088ad8&gt;] dump_backtrace+0x0/0x378
[  305.478991] [&lt;ffff000008088e64&gt;] show_stack+0x14/0x20
[  305.481237] [&lt;ffff00000895a178&gt;] dump_stack+0x98/0xb8
[  305.483294] [&lt;ffff0000080c3288&gt;] panic+0x118/0x280
[  305.485673] [&lt;ffff0000080c2e9c&gt;] nmi_panic+0x6c/0x70
[  305.486216] [&lt;ffff000008089710&gt;] handle_bad_stack+0x118/0x128
[  305.486612] Exception stack(0xffff80003efcf3a0 to 0xffff80003efcf4e0)
[  305.487334] f3a0: 0000000000000012 0000000000000013 0000000000000400 0000000000000000
[  305.488025] f3c0: 0000000000000000 0000000000000000 00000000000000e1 1313131313131313
[  305.488908] f3e0: ffff00000a5e8770 ffff000008545380 000000000005ee8d ffff000008ed2548
[  305.489403] f400: 0000000005f5e0ff ffff000008ff7872 000000000047a4c8 ffff000008ff546d
[  305.489759] f420: ffff0000081fe0b8 0000ffffa2601280 0000000000000006 0000000000000013
[  305.490256] f440: ffff000008f80188 ffff80003ce65000 0000000000000009 ffff000008f80138
[  305.490683] f460: ffff00000a5ebeb8 ffff00000a5ebeb8 ffff000008f80400 ffff000008981000
[  305.491051] f480: ffff80003d051c00 ffff00000a5e8350 ffff00000859f358 ffff00000a5e7f50
[  305.491444] f4a0: ffff00000859f330 0000000040000145 0000000000000000 0000000000000000
[  305.492008] f4c0: 0001000000000000 0000000000000000 ffff00000a5e8350 ffff00000859f330
[  305.493063] [&lt;ffff00000808205c&gt;] __bad_stack+0x88/0x8c
[  305.493396] [&lt;ffff00000859f330&gt;] recursive_loop+0x10/0x48
[  305.493731] [&lt;ffff00000859f358&gt;] recursive_loop+0x38/0x48
[  305.494088] [&lt;ffff00000859f358&gt;] recursive_loop+0x38/0x48
[  305.494425] [&lt;ffff00000859f358&gt;] recursive_loop+0x38/0x48
[  305.494649] [&lt;ffff00000859f358&gt;] recursive_loop+0x38/0x48
[  305.494898] [&lt;ffff00000859f358&gt;] recursive_loop+0x38/0x48
[  305.495205] [&lt;ffff00000859f358&gt;] recursive_loop+0x38/0x48
[  305.495453] [&lt;ffff00000859f358&gt;] recursive_loop+0x38/0x48
[  305.495708] [&lt;ffff00000859f358&gt;] recursive_loop+0x38/0x48
[  305.496000] [&lt;ffff00000859f358&gt;] recursive_loop+0x38/0x48
[  305.496302] [&lt;ffff00000859f358&gt;] recursive_loop+0x38/0x48
[  305.496644] [&lt;ffff00000859f358&gt;] recursive_loop+0x38/0x48
[  305.496894] [&lt;ffff00000859f358&gt;] recursive_loop+0x38/0x48
[  305.497138] [&lt;ffff00000859f358&gt;] recursive_loop+0x38/0x48
[  305.497325] [&lt;ffff00000859f3dc&gt;] lkdtm_OVERFLOW+0x14/0x20
[  305.497506] [&lt;ffff00000859f314&gt;] lkdtm_do_action+0x1c/0x28
[  305.497786] [&lt;ffff00000859f178&gt;] direct_entry+0xe0/0x170
[  305.498095] [&lt;ffff000008345568&gt;] full_proxy_write+0x60/0xa8
[  305.498387] [&lt;ffff0000081fb7f4&gt;] __vfs_write+0x1c/0x128
[  305.498679] [&lt;ffff0000081fcc68&gt;] vfs_write+0xa0/0x1b0
[  305.498926] [&lt;ffff0000081fe0fc&gt;] SyS_write+0x44/0xa0
[  305.499182] Exception stack(0xffff00000a5ebec0 to 0xffff00000a5ec000)
[  305.499429] bec0: 0000000000000001 000000001c4cf5e0 0000000000000009 000000001c4cf5e0
[  305.499674] bee0: 574f4c465245564f 0000000000000000 0000000000000000 8000000080808080
[  305.499904] bf00: 0000000000000040 0000000000000038 fefefeff1b4bc2ff 7f7f7f7f7f7fff7f
[  305.500189] bf20: 0101010101010101 0000000000000000 000000000047a4c8 0000000000000038
[  305.500712] bf40: 0000000000000000 0000ffffa2601280 0000ffffc63f6068 00000000004b5000
[  305.501241] bf60: 0000000000000001 000000001c4cf5e0 0000000000000009 000000001c4cf5e0
[  305.501791] bf80: 0000000000000020 0000000000000000 00000000004b5000 000000001c4cc458
[  305.502314] bfa0: 0000000000000000 0000ffffc63f7950 000000000040a3c4 0000ffffc63f70e0
[  305.502762] bfc0: 0000ffffa2601268 0000000080000000 0000000000000001 0000000000000040
[  305.503207] bfe0: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[  305.503680] [&lt;ffff000008082fb0&gt;] el0_svc_naked+0x24/0x28
[  305.504720] Kernel Offset: disabled
[  305.505189] CPU features: 0x002082
[  305.505473] Memory Limit: none
[  305.506181] ---[ end Kernel panic - not syncing: kernel stack overflow

This patch was co-authored by Ard Biesheuvel and Mark Rutland.

Signed-off-by: Ard Biesheuvel &lt;ard.biesheuvel@linaro.org&gt;
Signed-off-by: Mark Rutland &lt;mark.rutland@arm.com&gt;
Reviewed-by: Will Deacon &lt;will.deacon@arm.com&gt;
Tested-by: Laura Abbott &lt;labbott@redhat.com&gt;
Cc: Catalin Marinas &lt;catalin.marinas@arm.com&gt;
Cc: James Morse &lt;james.morse@arm.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This patch adds stack overflow detection to arm64, usable when vmap'd stacks
are in use.

Overflow is detected in a small preamble executed for each exception entry,
which checks whether there is enough space on the current stack for the general
purpose registers to be saved. If there is not enough space, the overflow
handler is invoked on a per-cpu overflow stack. This approach preserves the
original exception information in ESR_EL1 (and where appropriate, FAR_EL1).

Task and IRQ stacks are aligned to double their size, enabling overflow to be
detected with a single bit test. For example, a 16K stack is aligned to 32K,
ensuring that bit 14 of the SP must be zero. On an overflow (or underflow),
this bit is flipped. Thus, overflow (of less than the size of the stack) can be
detected by testing whether this bit is set.

The overflow check is performed before any attempt is made to access the
stack, avoiding recursive faults (and the loss of exception information
these would entail). As logical operations cannot be performed on the SP
directly, the SP is temporarily swapped with a general purpose register
using arithmetic operations to enable the test to be performed.

This gives us a useful error message on stack overflow, as can be trigger with
the LKDTM overflow test:

[  305.388749] lkdtm: Performing direct entry OVERFLOW
[  305.395444] Insufficient stack space to handle exception!
[  305.395482] ESR: 0x96000047 -- DABT (current EL)
[  305.399890] FAR: 0xffff00000a5e7f30
[  305.401315] Task stack:     [0xffff00000a5e8000..0xffff00000a5ec000]
[  305.403815] IRQ stack:      [0xffff000008000000..0xffff000008004000]
[  305.407035] Overflow stack: [0xffff80003efce4e0..0xffff80003efcf4e0]
[  305.409622] CPU: 0 PID: 1219 Comm: sh Not tainted 4.13.0-rc3-00021-g9636aea #5
[  305.412785] Hardware name: linux,dummy-virt (DT)
[  305.415756] task: ffff80003d051c00 task.stack: ffff00000a5e8000
[  305.419221] PC is at recursive_loop+0x10/0x48
[  305.421637] LR is at recursive_loop+0x38/0x48
[  305.423768] pc : [&lt;ffff00000859f330&gt;] lr : [&lt;ffff00000859f358&gt;] pstate: 40000145
[  305.428020] sp : ffff00000a5e7f50
[  305.430469] x29: ffff00000a5e8350 x28: ffff80003d051c00
[  305.433191] x27: ffff000008981000 x26: ffff000008f80400
[  305.439012] x25: ffff00000a5ebeb8 x24: ffff00000a5ebeb8
[  305.440369] x23: ffff000008f80138 x22: 0000000000000009
[  305.442241] x21: ffff80003ce65000 x20: ffff000008f80188
[  305.444552] x19: 0000000000000013 x18: 0000000000000006
[  305.446032] x17: 0000ffffa2601280 x16: ffff0000081fe0b8
[  305.448252] x15: ffff000008ff546d x14: 000000000047a4c8
[  305.450246] x13: ffff000008ff7872 x12: 0000000005f5e0ff
[  305.452953] x11: ffff000008ed2548 x10: 000000000005ee8d
[  305.454824] x9 : ffff000008545380 x8 : ffff00000a5e8770
[  305.457105] x7 : 1313131313131313 x6 : 00000000000000e1
[  305.459285] x5 : 0000000000000000 x4 : 0000000000000000
[  305.461781] x3 : 0000000000000000 x2 : 0000000000000400
[  305.465119] x1 : 0000000000000013 x0 : 0000000000000012
[  305.467724] Kernel panic - not syncing: kernel stack overflow
[  305.470561] CPU: 0 PID: 1219 Comm: sh Not tainted 4.13.0-rc3-00021-g9636aea #5
[  305.473325] Hardware name: linux,dummy-virt (DT)
[  305.475070] Call trace:
[  305.476116] [&lt;ffff000008088ad8&gt;] dump_backtrace+0x0/0x378
[  305.478991] [&lt;ffff000008088e64&gt;] show_stack+0x14/0x20
[  305.481237] [&lt;ffff00000895a178&gt;] dump_stack+0x98/0xb8
[  305.483294] [&lt;ffff0000080c3288&gt;] panic+0x118/0x280
[  305.485673] [&lt;ffff0000080c2e9c&gt;] nmi_panic+0x6c/0x70
[  305.486216] [&lt;ffff000008089710&gt;] handle_bad_stack+0x118/0x128
[  305.486612] Exception stack(0xffff80003efcf3a0 to 0xffff80003efcf4e0)
[  305.487334] f3a0: 0000000000000012 0000000000000013 0000000000000400 0000000000000000
[  305.488025] f3c0: 0000000000000000 0000000000000000 00000000000000e1 1313131313131313
[  305.488908] f3e0: ffff00000a5e8770 ffff000008545380 000000000005ee8d ffff000008ed2548
[  305.489403] f400: 0000000005f5e0ff ffff000008ff7872 000000000047a4c8 ffff000008ff546d
[  305.489759] f420: ffff0000081fe0b8 0000ffffa2601280 0000000000000006 0000000000000013
[  305.490256] f440: ffff000008f80188 ffff80003ce65000 0000000000000009 ffff000008f80138
[  305.490683] f460: ffff00000a5ebeb8 ffff00000a5ebeb8 ffff000008f80400 ffff000008981000
[  305.491051] f480: ffff80003d051c00 ffff00000a5e8350 ffff00000859f358 ffff00000a5e7f50
[  305.491444] f4a0: ffff00000859f330 0000000040000145 0000000000000000 0000000000000000
[  305.492008] f4c0: 0001000000000000 0000000000000000 ffff00000a5e8350 ffff00000859f330
[  305.493063] [&lt;ffff00000808205c&gt;] __bad_stack+0x88/0x8c
[  305.493396] [&lt;ffff00000859f330&gt;] recursive_loop+0x10/0x48
[  305.493731] [&lt;ffff00000859f358&gt;] recursive_loop+0x38/0x48
[  305.494088] [&lt;ffff00000859f358&gt;] recursive_loop+0x38/0x48
[  305.494425] [&lt;ffff00000859f358&gt;] recursive_loop+0x38/0x48
[  305.494649] [&lt;ffff00000859f358&gt;] recursive_loop+0x38/0x48
[  305.494898] [&lt;ffff00000859f358&gt;] recursive_loop+0x38/0x48
[  305.495205] [&lt;ffff00000859f358&gt;] recursive_loop+0x38/0x48
[  305.495453] [&lt;ffff00000859f358&gt;] recursive_loop+0x38/0x48
[  305.495708] [&lt;ffff00000859f358&gt;] recursive_loop+0x38/0x48
[  305.496000] [&lt;ffff00000859f358&gt;] recursive_loop+0x38/0x48
[  305.496302] [&lt;ffff00000859f358&gt;] recursive_loop+0x38/0x48
[  305.496644] [&lt;ffff00000859f358&gt;] recursive_loop+0x38/0x48
[  305.496894] [&lt;ffff00000859f358&gt;] recursive_loop+0x38/0x48
[  305.497138] [&lt;ffff00000859f358&gt;] recursive_loop+0x38/0x48
[  305.497325] [&lt;ffff00000859f3dc&gt;] lkdtm_OVERFLOW+0x14/0x20
[  305.497506] [&lt;ffff00000859f314&gt;] lkdtm_do_action+0x1c/0x28
[  305.497786] [&lt;ffff00000859f178&gt;] direct_entry+0xe0/0x170
[  305.498095] [&lt;ffff000008345568&gt;] full_proxy_write+0x60/0xa8
[  305.498387] [&lt;ffff0000081fb7f4&gt;] __vfs_write+0x1c/0x128
[  305.498679] [&lt;ffff0000081fcc68&gt;] vfs_write+0xa0/0x1b0
[  305.498926] [&lt;ffff0000081fe0fc&gt;] SyS_write+0x44/0xa0
[  305.499182] Exception stack(0xffff00000a5ebec0 to 0xffff00000a5ec000)
[  305.499429] bec0: 0000000000000001 000000001c4cf5e0 0000000000000009 000000001c4cf5e0
[  305.499674] bee0: 574f4c465245564f 0000000000000000 0000000000000000 8000000080808080
[  305.499904] bf00: 0000000000000040 0000000000000038 fefefeff1b4bc2ff 7f7f7f7f7f7fff7f
[  305.500189] bf20: 0101010101010101 0000000000000000 000000000047a4c8 0000000000000038
[  305.500712] bf40: 0000000000000000 0000ffffa2601280 0000ffffc63f6068 00000000004b5000
[  305.501241] bf60: 0000000000000001 000000001c4cf5e0 0000000000000009 000000001c4cf5e0
[  305.501791] bf80: 0000000000000020 0000000000000000 00000000004b5000 000000001c4cc458
[  305.502314] bfa0: 0000000000000000 0000ffffc63f7950 000000000040a3c4 0000ffffc63f70e0
[  305.502762] bfc0: 0000ffffa2601268 0000000080000000 0000000000000001 0000000000000040
[  305.503207] bfe0: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[  305.503680] [&lt;ffff000008082fb0&gt;] el0_svc_naked+0x24/0x28
[  305.504720] Kernel Offset: disabled
[  305.505189] CPU features: 0x002082
[  305.505473] Memory Limit: none
[  305.506181] ---[ end Kernel panic - not syncing: kernel stack overflow

This patch was co-authored by Ard Biesheuvel and Mark Rutland.

Signed-off-by: Ard Biesheuvel &lt;ard.biesheuvel@linaro.org&gt;
Signed-off-by: Mark Rutland &lt;mark.rutland@arm.com&gt;
Reviewed-by: Will Deacon &lt;will.deacon@arm.com&gt;
Tested-by: Laura Abbott &lt;labbott@redhat.com&gt;
Cc: Catalin Marinas &lt;catalin.marinas@arm.com&gt;
Cc: James Morse &lt;james.morse@arm.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
