<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/arch/arm64/kernel/fpsimd.c, branch v4.15</title>
<subtitle>Linux kernel source tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/'/>
<entry>
<title>arm64: fpsimd: Fix copying of FP state from signal frame into task struct</title>
<updated>2017-12-15T16:12:35+00:00</updated>
<author>
<name>Will Deacon</name>
<email>will.deacon@arm.com</email>
</author>
<published>2017-12-15T16:07:22+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=a4544831370618cb3627e27ffcc27d1cc857868f'/>
<id>a4544831370618cb3627e27ffcc27d1cc857868f</id>
<content type='text'>
Commit 9de52a755cfb6da5 ("arm64: fpsimd: Fix failure to restore FPSIMD
state after signals") fixed an issue reported in our FPSIMD signal
restore code but inadvertently introduced another issue which tends to
manifest as random SEGVs in userspace.

The problem is that when we copy the struct fpsimd_state from the kernel
stack (populated from the signal frame) into the struct held in the
current thread_struct, we blindly copy uninitialised stack into the
"cpu" field, which means that context-switching of the FP registers is
no longer reliable.

This patch fixes the problem by copying only the user_fpsimd member of
struct fpsimd_state. We should really rework the function prototypes
to take struct user_fpsimd_state * instead, but let's just get this
fixed for now.

Cc: Dave Martin &lt;Dave.Martin@arm.com&gt;
Fixes: 9de52a755cfb6da5 ("arm64: fpsimd: Fix failure to restore FPSIMD state after signals")
Reported-by: Geert Uytterhoeven &lt;geert@linux-m68k.org&gt;
Signed-off-by: Will Deacon &lt;will.deacon@arm.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Commit 9de52a755cfb6da5 ("arm64: fpsimd: Fix failure to restore FPSIMD
state after signals") fixed an issue reported in our FPSIMD signal
restore code but inadvertently introduced another issue which tends to
manifest as random SEGVs in userspace.

The problem is that when we copy the struct fpsimd_state from the kernel
stack (populated from the signal frame) into the struct held in the
current thread_struct, we blindly copy uninitialised stack into the
"cpu" field, which means that context-switching of the FP registers is
no longer reliable.

This patch fixes the problem by copying only the user_fpsimd member of
struct fpsimd_state. We should really rework the function prototypes
to take struct user_fpsimd_state * instead, but let's just get this
fixed for now.

Cc: Dave Martin &lt;Dave.Martin@arm.com&gt;
Fixes: 9de52a755cfb6da5 ("arm64: fpsimd: Fix failure to restore FPSIMD state after signals")
Reported-by: Geert Uytterhoeven &lt;geert@linux-m68k.org&gt;
Signed-off-by: Will Deacon &lt;will.deacon@arm.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>arm64/sve: Avoid dereference of dead task_struct in KVM guest entry</title>
<updated>2017-12-06T19:08:05+00:00</updated>
<author>
<name>Dave Martin</name>
<email>Dave.Martin@arm.com</email>
</author>
<published>2017-12-06T16:45:47+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=cb968afc789821cdf9e17e79ef08ab90e5bae0f2'/>
<id>cb968afc789821cdf9e17e79ef08ab90e5bae0f2</id>
<content type='text'>
When deciding whether to invalidate FPSIMD state cached in the cpu,
the backend function sve_flush_cpu_state() attempts to dereference
__this_cpu_read(fpsimd_last_state).  However, this is not safe:
there is no guarantee that this task_struct pointer is still valid,
because the task could have exited in the meantime.

This means that we need another means to get the appropriate value
of TIF_SVE for the associated task.

This patch solves this issue by adding a cached copy of the TIF_SVE
flag in fpsimd_last_state, which we can check without dereferencing
the task pointer.

In particular, although this patch is not a KVM fix per se, this
means that this check is now done safely in the KVM world switch
path (which is currently the only user of this code).

Signed-off-by: Dave Martin &lt;Dave.Martin@arm.com&gt;
Cc: Ard Biesheuvel &lt;ard.biesheuvel@linaro.org&gt;
Cc: Christoffer Dall &lt;christoffer.dall@linaro.org&gt;
Cc: Marc Zyngier &lt;marc.zyngier@arm.com&gt;
Signed-off-by: Will Deacon &lt;will.deacon@arm.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When deciding whether to invalidate FPSIMD state cached in the cpu,
the backend function sve_flush_cpu_state() attempts to dereference
__this_cpu_read(fpsimd_last_state).  However, this is not safe:
there is no guarantee that this task_struct pointer is still valid,
because the task could have exited in the meantime.

This means that we need another means to get the appropriate value
of TIF_SVE for the associated task.

This patch solves this issue by adding a cached copy of the TIF_SVE
flag in fpsimd_last_state, which we can check without dereferencing
the task pointer.

In particular, although this patch is not a KVM fix per se, this
means that this check is now done safely in the KVM world switch
path (which is currently the only user of this code).

Signed-off-by: Dave Martin &lt;Dave.Martin@arm.com&gt;
Cc: Ard Biesheuvel &lt;ard.biesheuvel@linaro.org&gt;
Cc: Christoffer Dall &lt;christoffer.dall@linaro.org&gt;
Cc: Marc Zyngier &lt;marc.zyngier@arm.com&gt;
Signed-off-by: Will Deacon &lt;will.deacon@arm.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>arm64: fpsimd: Abstract out binding of task's fpsimd context to the cpu.</title>
<updated>2017-12-06T18:28:10+00:00</updated>
<author>
<name>Dave Martin</name>
<email>Dave.Martin@arm.com</email>
</author>
<published>2017-12-06T16:45:46+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=8884b7bd7e52de20a801c5f457954ed212c0f625'/>
<id>8884b7bd7e52de20a801c5f457954ed212c0f625</id>
<content type='text'>
There is currently some duplicate logic to associate current's
FPSIMD context with the cpu when loading FPSIMD state into the cpu
regs.

Subsequent patches will update that logic, so in order to ensure it
only needs to be done in one place, this patch factors the relevant
code out into a new function fpsimd_bind_to_cpu().

Signed-off-by: Dave Martin &lt;Dave.Martin@arm.com&gt;
Reviewed-by: Ard Biesheuvel &lt;ard.biesheuvel@linaro.org&gt;
Signed-off-by: Will Deacon &lt;will.deacon@arm.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
There is currently some duplicate logic to associate current's
FPSIMD context with the cpu when loading FPSIMD state into the cpu
regs.

Subsequent patches will update that logic, so in order to ensure it
only needs to be done in one place, this patch factors the relevant
code out into a new function fpsimd_bind_to_cpu().

Signed-off-by: Dave Martin &lt;Dave.Martin@arm.com&gt;
Reviewed-by: Ard Biesheuvel &lt;ard.biesheuvel@linaro.org&gt;
Signed-off-by: Will Deacon &lt;will.deacon@arm.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>arm64: fpsimd: Fix failure to restore FPSIMD state after signals</title>
<updated>2017-12-01T13:05:05+00:00</updated>
<author>
<name>Dave Martin</name>
<email>Dave.Martin@arm.com</email>
</author>
<published>2017-11-30T11:56:37+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=9de52a755cfb6da5ee21a07e3a868bdc8fbfccb3'/>
<id>9de52a755cfb6da5ee21a07e3a868bdc8fbfccb3</id>
<content type='text'>
The fpsimd_update_current_state() function is responsible for
loading the FPSIMD state from the user signal frame into the
current task during sigreturn.  When implementing support for SVE,
conditional code was added to this function in order to handle the
case where SVE state need to be loaded for the task and merged with
the FPSIMD data from the signal frame; however, the FPSIMD-only
case was unintentionally dropped.

As a result of this, sigreturn does not currently restore the
FPSIMD state of the task, except in the case where the system
supports SVE and the signal frame contains SVE state in addition to
FPSIMD state.

This patch fixes this bug by making the copy-in of the FPSIMD data
from the signal frame to thread_struct unconditional.

This remains a performance regression from v4.14, since the FPSIMD
state is now copied into thread_struct and then loaded back,
instead of _only_ being loaded into the CPU FPSIMD registers.
However, it is essential to call task_fpsimd_load() here anyway in
order to ensure that the SVE enable bit in CPACR_EL1 is set
correctly before returning to userspace.  This could use some
refactoring, but since sigreturn is not a fast path I have kept
this patch as a pure fix and left the refactoring for later.

Cc: Catalin Marinas &lt;catalin.marinas@arm.com&gt;
Cc: Ard Biesheuvel &lt;ard.biesheuvel@linaro.org&gt;
Fixes: 8cd969d28fd2 ("arm64/sve: Signal handling support")
Reported-by: Alex Bennée &lt;alex.bennee@linaro.org&gt;
Tested-by: Alex Bennée &lt;alex.bennee@linaro.org&gt;
Reviewed-by: Alex Bennée &lt;alex.bennee@linaro.org&gt;
Signed-off-by: Dave Martin &lt;Dave.Martin@arm.com&gt;
Signed-off-by: Will Deacon &lt;will.deacon@arm.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The fpsimd_update_current_state() function is responsible for
loading the FPSIMD state from the user signal frame into the
current task during sigreturn.  When implementing support for SVE,
conditional code was added to this function in order to handle the
case where SVE state need to be loaded for the task and merged with
the FPSIMD data from the signal frame; however, the FPSIMD-only
case was unintentionally dropped.

As a result of this, sigreturn does not currently restore the
FPSIMD state of the task, except in the case where the system
supports SVE and the signal frame contains SVE state in addition to
FPSIMD state.

This patch fixes this bug by making the copy-in of the FPSIMD data
from the signal frame to thread_struct unconditional.

This remains a performance regression from v4.14, since the FPSIMD
state is now copied into thread_struct and then loaded back,
instead of _only_ being loaded into the CPU FPSIMD registers.
However, it is essential to call task_fpsimd_load() here anyway in
order to ensure that the SVE enable bit in CPACR_EL1 is set
correctly before returning to userspace.  This could use some
refactoring, but since sigreturn is not a fast path I have kept
this patch as a pure fix and left the refactoring for later.

Cc: Catalin Marinas &lt;catalin.marinas@arm.com&gt;
Cc: Ard Biesheuvel &lt;ard.biesheuvel@linaro.org&gt;
Fixes: 8cd969d28fd2 ("arm64/sve: Signal handling support")
Reported-by: Alex Bennée &lt;alex.bennee@linaro.org&gt;
Tested-by: Alex Bennée &lt;alex.bennee@linaro.org&gt;
Reviewed-by: Alex Bennée &lt;alex.bennee@linaro.org&gt;
Signed-off-by: Dave Martin &lt;Dave.Martin@arm.com&gt;
Signed-off-by: Will Deacon &lt;will.deacon@arm.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux</title>
<updated>2017-11-15T18:56:56+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2017-11-15T18:56:56+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=c9b012e5f4a1d01dfa8abc6318211a67ba7d5db2'/>
<id>c9b012e5f4a1d01dfa8abc6318211a67ba7d5db2</id>
<content type='text'>
Pull arm64 updates from Will Deacon:
 "The big highlight is support for the Scalable Vector Extension (SVE)
  which required extensive ABI work to ensure we don't break existing
  applications by blowing away their signal stack with the rather large
  new vector context (&lt;= 2 kbit per vector register). There's further
  work to be done optimising things like exception return, but the ABI
  is solid now.

  Much of the line count comes from some new PMU drivers we have, but
  they're pretty self-contained and I suspect we'll have more of them in
  future.

  Plenty of acronym soup here:

   - initial support for the Scalable Vector Extension (SVE)

   - improved handling for SError interrupts (required to handle RAS
     events)

   - enable GCC support for 128-bit integer types

   - remove kernel text addresses from backtraces and register dumps

   - use of WFE to implement long delay()s

   - ACPI IORT updates from Lorenzo Pieralisi

   - perf PMU driver for the Statistical Profiling Extension (SPE)

   - perf PMU driver for Hisilicon's system PMUs

   - misc cleanups and non-critical fixes"

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (97 commits)
  arm64: Make ARMV8_DEPRECATED depend on SYSCTL
  arm64: Implement __lshrti3 library function
  arm64: support __int128 on gcc 5+
  arm64/sve: Add documentation
  arm64/sve: Detect SVE and activate runtime support
  arm64/sve: KVM: Hide SVE from CPU features exposed to guests
  arm64/sve: KVM: Treat guest SVE use as undefined instruction execution
  arm64/sve: KVM: Prevent guests from using SVE
  arm64/sve: Add sysctl to set the default vector length for new processes
  arm64/sve: Add prctl controls for userspace vector length management
  arm64/sve: ptrace and ELF coredump support
  arm64/sve: Preserve SVE registers around EFI runtime service calls
  arm64/sve: Preserve SVE registers around kernel-mode NEON use
  arm64/sve: Probe SVE capabilities and usable vector lengths
  arm64: cpufeature: Move sys_caps_initialised declarations
  arm64/sve: Backend logic for setting the vector length
  arm64/sve: Signal handling support
  arm64/sve: Support vector length resetting for new processes
  arm64/sve: Core task context handling
  arm64/sve: Low-level CPU setup
  ...
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull arm64 updates from Will Deacon:
 "The big highlight is support for the Scalable Vector Extension (SVE)
  which required extensive ABI work to ensure we don't break existing
  applications by blowing away their signal stack with the rather large
  new vector context (&lt;= 2 kbit per vector register). There's further
  work to be done optimising things like exception return, but the ABI
  is solid now.

  Much of the line count comes from some new PMU drivers we have, but
  they're pretty self-contained and I suspect we'll have more of them in
  future.

  Plenty of acronym soup here:

   - initial support for the Scalable Vector Extension (SVE)

   - improved handling for SError interrupts (required to handle RAS
     events)

   - enable GCC support for 128-bit integer types

   - remove kernel text addresses from backtraces and register dumps

   - use of WFE to implement long delay()s

   - ACPI IORT updates from Lorenzo Pieralisi

   - perf PMU driver for the Statistical Profiling Extension (SPE)

   - perf PMU driver for Hisilicon's system PMUs

   - misc cleanups and non-critical fixes"

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (97 commits)
  arm64: Make ARMV8_DEPRECATED depend on SYSCTL
  arm64: Implement __lshrti3 library function
  arm64: support __int128 on gcc 5+
  arm64/sve: Add documentation
  arm64/sve: Detect SVE and activate runtime support
  arm64/sve: KVM: Hide SVE from CPU features exposed to guests
  arm64/sve: KVM: Treat guest SVE use as undefined instruction execution
  arm64/sve: KVM: Prevent guests from using SVE
  arm64/sve: Add sysctl to set the default vector length for new processes
  arm64/sve: Add prctl controls for userspace vector length management
  arm64/sve: ptrace and ELF coredump support
  arm64/sve: Preserve SVE registers around EFI runtime service calls
  arm64/sve: Preserve SVE registers around kernel-mode NEON use
  arm64/sve: Probe SVE capabilities and usable vector lengths
  arm64: cpufeature: Move sys_caps_initialised declarations
  arm64/sve: Backend logic for setting the vector length
  arm64/sve: Signal handling support
  arm64/sve: Support vector length resetting for new processes
  arm64/sve: Core task context handling
  arm64/sve: Low-level CPU setup
  ...
</pre>
</div>
</content>
</entry>
<entry>
<title>arm64/sve: KVM: Prevent guests from using SVE</title>
<updated>2017-11-03T15:24:19+00:00</updated>
<author>
<name>Dave Martin</name>
<email>Dave.Martin@arm.com</email>
</author>
<published>2017-10-31T15:51:16+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=17eed27b02da88560b4592390952b9a71042ab8b'/>
<id>17eed27b02da88560b4592390952b9a71042ab8b</id>
<content type='text'>
Until KVM has full SVE support, guests must not be allowed to
execute SVE instructions.

This patch enables the necessary traps, and also ensures that the
traps are disabled again on exit from the guest so that the host
can still use SVE if it wants to.

On guest exit, high bits of the SVE Zn registers may have been
clobbered as a side-effect the execution of FPSIMD instructions in
the guest.  The existing KVM host FPSIMD restore code is not
sufficient to restore these bits, so this patch explicitly marks
the CPU as not containing cached vector state for any task, thus
forcing a reload on the next return to userspace.  This is an
interim measure, in advance of adding full SVE awareness to KVM.

This marking of cached vector state in the CPU as invalid is done
using __this_cpu_write(fpsimd_last_state, NULL) in fpsimd.c.  Due
to the repeated use of this rather obscure operation, it makes
sense to factor it out as a separate helper with a clearer name.
This patch factors it out as fpsimd_flush_cpu_state(), and ports
all callers to use it.

As a side effect of this refactoring, a this_cpu_write() in
fpsimd_cpu_pm_notifier() is changed to __this_cpu_write().  This
should be fine, since cpu_pm_enter() is supposed to be called only
with interrupts disabled.

Signed-off-by: Dave Martin &lt;Dave.Martin@arm.com&gt;
Reviewed-by: Alex Bennée &lt;alex.bennee@linaro.org&gt;
Reviewed-by: Christoffer Dall &lt;christoffer.dall@linaro.org&gt;
Acked-by: Catalin Marinas &lt;catalin.marinas@arm.com&gt;
Acked-by: Marc Zyngier &lt;marc.zyngier@arm.com&gt;
Cc: Ard Biesheuvel &lt;ard.biesheuvel@linaro.org&gt;
Signed-off-by: Will Deacon &lt;will.deacon@arm.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Until KVM has full SVE support, guests must not be allowed to
execute SVE instructions.

This patch enables the necessary traps, and also ensures that the
traps are disabled again on exit from the guest so that the host
can still use SVE if it wants to.

On guest exit, high bits of the SVE Zn registers may have been
clobbered as a side-effect the execution of FPSIMD instructions in
the guest.  The existing KVM host FPSIMD restore code is not
sufficient to restore these bits, so this patch explicitly marks
the CPU as not containing cached vector state for any task, thus
forcing a reload on the next return to userspace.  This is an
interim measure, in advance of adding full SVE awareness to KVM.

This marking of cached vector state in the CPU as invalid is done
using __this_cpu_write(fpsimd_last_state, NULL) in fpsimd.c.  Due
to the repeated use of this rather obscure operation, it makes
sense to factor it out as a separate helper with a clearer name.
This patch factors it out as fpsimd_flush_cpu_state(), and ports
all callers to use it.

As a side effect of this refactoring, a this_cpu_write() in
fpsimd_cpu_pm_notifier() is changed to __this_cpu_write().  This
should be fine, since cpu_pm_enter() is supposed to be called only
with interrupts disabled.

Signed-off-by: Dave Martin &lt;Dave.Martin@arm.com&gt;
Reviewed-by: Alex Bennée &lt;alex.bennee@linaro.org&gt;
Reviewed-by: Christoffer Dall &lt;christoffer.dall@linaro.org&gt;
Acked-by: Catalin Marinas &lt;catalin.marinas@arm.com&gt;
Acked-by: Marc Zyngier &lt;marc.zyngier@arm.com&gt;
Cc: Ard Biesheuvel &lt;ard.biesheuvel@linaro.org&gt;
Signed-off-by: Will Deacon &lt;will.deacon@arm.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>arm64/sve: Add sysctl to set the default vector length for new processes</title>
<updated>2017-11-03T15:24:19+00:00</updated>
<author>
<name>Dave Martin</name>
<email>Dave.Martin@arm.com</email>
</author>
<published>2017-10-31T15:51:15+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=4ffa09a939ab6d95655b3aee6ff79de48df95be7'/>
<id>4ffa09a939ab6d95655b3aee6ff79de48df95be7</id>
<content type='text'>
Because of the effect of SVE on the size of the signal frame, the
default vector length used for new processes involves a tradeoff
between performance of SVE-enabled software on the one hand, and
reliability of non-SVE-aware software on the other hand.

For this reason, the best choice depends on the repertoire of
userspace software in use and is thus best left up to distro
maintainers, sysadmins and developers.

If CONFIG_SYSCTL and CONFIG_PROC_SYSCTL are enabled, this patch
exposes the default vector length in
/proc/sys/abi/sve_default_vector_length, where boot scripts or the
adventurous can poke it.

In common with other arm64 ABI sysctls, this control is currently
global: setting it requires CAP_SYS_ADMIN in the root user
namespace, but the value set is effective for subsequent execs in
all namespaces.  The control only affects _new_ processes, however:
changing it does not affect the vector length of any existing
process.

The intended usage model is that if userspace is known to be fully
SVE-tolerant (or a developer is curious to find out) then this
parameter can be cranked up during system startup.

Signed-off-by: Dave Martin &lt;Dave.Martin@arm.com&gt;
Reviewed-by: Alex Bennée &lt;alex.bennee@linaro.org&gt;
Reviewed-by: Catalin Marinas &lt;catalin.marinas@arm.com&gt;
Signed-off-by: Will Deacon &lt;will.deacon@arm.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Because of the effect of SVE on the size of the signal frame, the
default vector length used for new processes involves a tradeoff
between performance of SVE-enabled software on the one hand, and
reliability of non-SVE-aware software on the other hand.

For this reason, the best choice depends on the repertoire of
userspace software in use and is thus best left up to distro
maintainers, sysadmins and developers.

If CONFIG_SYSCTL and CONFIG_PROC_SYSCTL are enabled, this patch
exposes the default vector length in
/proc/sys/abi/sve_default_vector_length, where boot scripts or the
adventurous can poke it.

In common with other arm64 ABI sysctls, this control is currently
global: setting it requires CAP_SYS_ADMIN in the root user
namespace, but the value set is effective for subsequent execs in
all namespaces.  The control only affects _new_ processes, however:
changing it does not affect the vector length of any existing
process.

The intended usage model is that if userspace is known to be fully
SVE-tolerant (or a developer is curious to find out) then this
parameter can be cranked up during system startup.

Signed-off-by: Dave Martin &lt;Dave.Martin@arm.com&gt;
Reviewed-by: Alex Bennée &lt;alex.bennee@linaro.org&gt;
Reviewed-by: Catalin Marinas &lt;catalin.marinas@arm.com&gt;
Signed-off-by: Will Deacon &lt;will.deacon@arm.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>arm64/sve: Add prctl controls for userspace vector length management</title>
<updated>2017-11-03T15:24:19+00:00</updated>
<author>
<name>Dave Martin</name>
<email>Dave.Martin@arm.com</email>
</author>
<published>2017-10-31T15:51:14+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=2d2123bc7c7f843aa9db87720de159a049839862'/>
<id>2d2123bc7c7f843aa9db87720de159a049839862</id>
<content type='text'>
This patch adds two arm64-specific prctls, to permit userspace to
control its vector length:

 * PR_SVE_SET_VL: set the thread's SVE vector length and vector
   length inheritance mode.

 * PR_SVE_GET_VL: get the same information.

Although these prctls resemble instruction set features in the SVE
architecture, they provide additional control: the vector length
inheritance mode is Linux-specific and nothing to do with the
architecture, and the architecture does not permit EL0 to set its
own vector length directly.  Both can be used in portable tools
without requiring the use of SVE instructions.

Signed-off-by: Dave Martin &lt;Dave.Martin@arm.com&gt;
Reviewed-by: Catalin Marinas &lt;catalin.marinas@arm.com&gt;
Cc: Alex Bennée &lt;alex.bennee@linaro.org&gt;
[will: Fixed up prctl constants to avoid clash with PDEATHSIG]
Signed-off-by: Will Deacon &lt;will.deacon@arm.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This patch adds two arm64-specific prctls, to permit userspace to
control its vector length:

 * PR_SVE_SET_VL: set the thread's SVE vector length and vector
   length inheritance mode.

 * PR_SVE_GET_VL: get the same information.

Although these prctls resemble instruction set features in the SVE
architecture, they provide additional control: the vector length
inheritance mode is Linux-specific and nothing to do with the
architecture, and the architecture does not permit EL0 to set its
own vector length directly.  Both can be used in portable tools
without requiring the use of SVE instructions.

Signed-off-by: Dave Martin &lt;Dave.Martin@arm.com&gt;
Reviewed-by: Catalin Marinas &lt;catalin.marinas@arm.com&gt;
Cc: Alex Bennée &lt;alex.bennee@linaro.org&gt;
[will: Fixed up prctl constants to avoid clash with PDEATHSIG]
Signed-off-by: Will Deacon &lt;will.deacon@arm.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>arm64/sve: ptrace and ELF coredump support</title>
<updated>2017-11-03T15:24:18+00:00</updated>
<author>
<name>Dave Martin</name>
<email>Dave.Martin@arm.com</email>
</author>
<published>2017-10-31T15:51:13+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=43d4da2c45b2f5d62f8a79ff7c6f95089bb24656'/>
<id>43d4da2c45b2f5d62f8a79ff7c6f95089bb24656</id>
<content type='text'>
This patch defines and implements a new regset NT_ARM_SVE, which
describes a thread's SVE register state.  This allows a debugger to
manipulate the SVE state, as well as being included in ELF
coredumps for post-mortem debugging.

Because the regset size and layout are dependent on the thread's
current vector length, it is not possible to define a C struct to
describe the regset contents as is done for existing regsets.
Instead, and for the same reasons, NT_ARM_SVE is based on the
freeform variable-layout approach used for the SVE signal frame.

Additionally, to reduce debug overhead when debugging threads that
might or might not have live SVE register state, NT_ARM_SVE may be
presented in one of two different formats: the old struct
user_fpsimd_state format is embedded for describing the state of a
thread with no live SVE state, whereas a new variable-layout
structure is embedded for describing live SVE state.  This avoids a
debugger needing to poll NT_PRFPREG in addition to NT_ARM_SVE, and
allows existing userspace code to handle the non-SVE case without
too much modification.

For this to work, NT_ARM_SVE is defined with a fixed-format header
of type struct user_sve_header, which the recipient can use to
figure out the content, size and layout of the reset of the regset.
Accessor macros are defined to allow the vector-length-dependent
parts of the regset to be manipulated.

Signed-off-by: Alan Hayward &lt;alan.hayward@arm.com&gt;
Signed-off-by: Dave Martin &lt;Dave.Martin@arm.com&gt;
Reviewed-by: Catalin Marinas &lt;catalin.marinas@arm.com&gt;
Cc: Alex Bennée &lt;alex.bennee@linaro.org&gt;
Cc: Okamoto Takayuki &lt;tokamoto@jp.fujitsu.com&gt;
Signed-off-by: Will Deacon &lt;will.deacon@arm.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This patch defines and implements a new regset NT_ARM_SVE, which
describes a thread's SVE register state.  This allows a debugger to
manipulate the SVE state, as well as being included in ELF
coredumps for post-mortem debugging.

Because the regset size and layout are dependent on the thread's
current vector length, it is not possible to define a C struct to
describe the regset contents as is done for existing regsets.
Instead, and for the same reasons, NT_ARM_SVE is based on the
freeform variable-layout approach used for the SVE signal frame.

Additionally, to reduce debug overhead when debugging threads that
might or might not have live SVE register state, NT_ARM_SVE may be
presented in one of two different formats: the old struct
user_fpsimd_state format is embedded for describing the state of a
thread with no live SVE state, whereas a new variable-layout
structure is embedded for describing live SVE state.  This avoids a
debugger needing to poll NT_PRFPREG in addition to NT_ARM_SVE, and
allows existing userspace code to handle the non-SVE case without
too much modification.

For this to work, NT_ARM_SVE is defined with a fixed-format header
of type struct user_sve_header, which the recipient can use to
figure out the content, size and layout of the reset of the regset.
Accessor macros are defined to allow the vector-length-dependent
parts of the regset to be manipulated.

Signed-off-by: Alan Hayward &lt;alan.hayward@arm.com&gt;
Signed-off-by: Dave Martin &lt;Dave.Martin@arm.com&gt;
Reviewed-by: Catalin Marinas &lt;catalin.marinas@arm.com&gt;
Cc: Alex Bennée &lt;alex.bennee@linaro.org&gt;
Cc: Okamoto Takayuki &lt;tokamoto@jp.fujitsu.com&gt;
Signed-off-by: Will Deacon &lt;will.deacon@arm.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>arm64/sve: Preserve SVE registers around EFI runtime service calls</title>
<updated>2017-11-03T15:24:18+00:00</updated>
<author>
<name>Dave Martin</name>
<email>Dave.Martin@arm.com</email>
</author>
<published>2017-10-31T15:51:12+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=fdfa976cae5cbd46aa4b4f9e554a93c9f8b35c62'/>
<id>fdfa976cae5cbd46aa4b4f9e554a93c9f8b35c62</id>
<content type='text'>
The EFI runtime services ABI allows EFI to make free use of the
FPSIMD registers during EFI runtime service calls, subject to the
callee-save requirements of the AArch64 procedure call standard.

However, the SVE architecture allows upper bits of the SVE vector
registers to be zeroed as a side-effect of FPSIMD V-register
writes.  This means that the SVE vector registers must be saved in
their entirety in order to avoid data loss: non-SVE-aware EFI
implementations cannot restore them correctly.

The non-IRQ case is already handled gracefully by
kernel_neon_begin().  For the IRQ case, this patch allocates a
suitable per-CPU stash buffer for the full SVE register state and
uses it to preserve the affected registers around EFI calls.  It is
currently unclear how the EFI runtime services ABI will be
clarified with respect to SVE, so it safest to assume that the
predicate registers and FFR must be saved and restored too.

No attempt is made to restore the restore the vector length after
a call, for now.  It is deemed rather insane for EFI to change it,
and contemporary EFI implementations certainly won't.

Signed-off-by: Dave Martin &lt;Dave.Martin@arm.com&gt;
Reviewed-by: Alex Bennée &lt;alex.bennee@linaro.org&gt;
Reviewed-by: Catalin Marinas &lt;catalin.marinas@arm.com&gt;
Cc: Ard Biesheuvel &lt;ard.biesheuvel@linaro.org&gt;
Signed-off-by: Will Deacon &lt;will.deacon@arm.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The EFI runtime services ABI allows EFI to make free use of the
FPSIMD registers during EFI runtime service calls, subject to the
callee-save requirements of the AArch64 procedure call standard.

However, the SVE architecture allows upper bits of the SVE vector
registers to be zeroed as a side-effect of FPSIMD V-register
writes.  This means that the SVE vector registers must be saved in
their entirety in order to avoid data loss: non-SVE-aware EFI
implementations cannot restore them correctly.

The non-IRQ case is already handled gracefully by
kernel_neon_begin().  For the IRQ case, this patch allocates a
suitable per-CPU stash buffer for the full SVE register state and
uses it to preserve the affected registers around EFI calls.  It is
currently unclear how the EFI runtime services ABI will be
clarified with respect to SVE, so it safest to assume that the
predicate registers and FFR must be saved and restored too.

No attempt is made to restore the restore the vector length after
a call, for now.  It is deemed rather insane for EFI to change it,
and contemporary EFI implementations certainly won't.

Signed-off-by: Dave Martin &lt;Dave.Martin@arm.com&gt;
Reviewed-by: Alex Bennée &lt;alex.bennee@linaro.org&gt;
Reviewed-by: Catalin Marinas &lt;catalin.marinas@arm.com&gt;
Cc: Ard Biesheuvel &lt;ard.biesheuvel@linaro.org&gt;
Signed-off-by: Will Deacon &lt;will.deacon@arm.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
