<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-stable.git/arch/powerpc/kernel/vector.S, branch v3.13</title>
<subtitle>Linux kernel stable tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/'/>
<entry>
<title>powerpc: Don't corrupt user registers on 32-bit</title>
<updated>2013-10-23T11:34:19+00:00</updated>
<author>
<name>Paul Mackerras</name>
<email>paulus@samba.org</email>
</author>
<published>2013-10-23T08:40:02+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=955c1cab809edfb5429603c68493363074ac20cf'/>
<id>955c1cab809edfb5429603c68493363074ac20cf</id>
<content type='text'>
Commit de79f7b9f6 ("powerpc: Put FP/VSX and VR state into structures")
modified load_up_fpu() and load_up_altivec() in such a way that they
now use r7 and r8.  Unfortunately, the callers of these functions on
32-bit machines then return to userspace via fast_exception_return,
which doesn't restore all of the volatile GPRs, but only r1, r3 -- r6
and r9 -- r12.  This was causing userspace segfaults and other
userspace misbehaviour on 32-bit machines.

This fixes the problem by changing the register usage of load_up_fpu()
and load_up_altivec() to avoid using r7 and r8 and instead use r6 and
r10.  This also adds comments to those functions saying which registers
may be used.

Signed-off-by: Paul Mackerras &lt;paulus@samba.org&gt;
Tested-by: Scott Wood &lt;scottwood@freescale.com&gt; (on e500mc, so no altivec)
Signed-off-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Commit de79f7b9f6 ("powerpc: Put FP/VSX and VR state into structures")
modified load_up_fpu() and load_up_altivec() in such a way that they
now use r7 and r8.  Unfortunately, the callers of these functions on
32-bit machines then return to userspace via fast_exception_return,
which doesn't restore all of the volatile GPRs, but only r1, r3 -- r6
and r9 -- r12.  This was causing userspace segfaults and other
userspace misbehaviour on 32-bit machines.

This fixes the problem by changing the register usage of load_up_fpu()
and load_up_altivec() to avoid using r7 and r8 and instead use r6 and
r10.  This also adds comments to those functions saying which registers
may be used.

Signed-off-by: Paul Mackerras &lt;paulus@samba.org&gt;
Tested-by: Scott Wood &lt;scottwood@freescale.com&gt; (on e500mc, so no altivec)
Signed-off-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc: Provide for giveup_fpu/altivec to save state in alternate location</title>
<updated>2013-10-11T06:26:50+00:00</updated>
<author>
<name>Paul Mackerras</name>
<email>paulus@samba.org</email>
</author>
<published>2013-09-10T10:21:10+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=18461960cbf50bf345ef0667d45d5f64de8fb893'/>
<id>18461960cbf50bf345ef0667d45d5f64de8fb893</id>
<content type='text'>
This provides a facility which is intended for use by KVM, where the
contents of the FP/VSX and VMX (Altivec) registers can be saved away
to somewhere other than the thread_struct when kernel code wants to
use floating point or VMX instructions.  This is done by providing a
pointer in the thread_struct to indicate where the state should be
saved to.  The giveup_fpu() and giveup_altivec() functions test these
pointers and save state to the indicated location if they are non-NULL.
Note that the MSR_FP/VEC bits in task-&gt;thread.regs-&gt;msr are still used
to indicate whether the CPU register state is live, even when an
alternate save location is being used.

This also provides load_fp_state() and load_vr_state() functions, which
load up FP/VSX and VMX state from memory into the CPU registers, and
corresponding store_fp_state() and store_vr_state() functions, which
store FP/VSX and VMX state into memory from the CPU registers.

Signed-off-by: Paul Mackerras &lt;paulus@samba.org&gt;
Signed-off-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This provides a facility which is intended for use by KVM, where the
contents of the FP/VSX and VMX (Altivec) registers can be saved away
to somewhere other than the thread_struct when kernel code wants to
use floating point or VMX instructions.  This is done by providing a
pointer in the thread_struct to indicate where the state should be
saved to.  The giveup_fpu() and giveup_altivec() functions test these
pointers and save state to the indicated location if they are non-NULL.
Note that the MSR_FP/VEC bits in task-&gt;thread.regs-&gt;msr are still used
to indicate whether the CPU register state is live, even when an
alternate save location is being used.

This also provides load_fp_state() and load_vr_state() functions, which
load up FP/VSX and VMX state from memory into the CPU registers, and
corresponding store_fp_state() and store_vr_state() functions, which
store FP/VSX and VMX state into memory from the CPU registers.

Signed-off-by: Paul Mackerras &lt;paulus@samba.org&gt;
Signed-off-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc: Put FP/VSX and VR state into structures</title>
<updated>2013-10-11T06:26:49+00:00</updated>
<author>
<name>Paul Mackerras</name>
<email>paulus@samba.org</email>
</author>
<published>2013-09-10T10:20:42+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=de79f7b9f6f92ec1bd6f61fa1f20de60728a5b5e'/>
<id>de79f7b9f6f92ec1bd6f61fa1f20de60728a5b5e</id>
<content type='text'>
This creates new 'thread_fp_state' and 'thread_vr_state' structures
to store FP/VSX state (including FPSCR) and Altivec/VSX state
(including VSCR), and uses them in the thread_struct.  In the
thread_fp_state, the FPRs and VSRs are represented as u64 rather
than double, since we rarely perform floating-point computations
on the values, and this will enable the structures to be used
in KVM code as well.  Similarly FPSCR is now a u64 rather than
a structure of two 32-bit values.

This takes the offsets out of the macros such as SAVE_32FPRS,
REST_32FPRS, etc.  This enables the same macros to be used for normal
and transactional state, enabling us to delete the transactional
versions of the macros.   This also removes the unused do_load_up_fpu
and do_load_up_altivec, which were in fact buggy since they didn't
create large enough stack frames to account for the fact that
load_up_fpu and load_up_altivec are not designed to be called from C
and assume that their caller's stack frame is an interrupt frame.

Signed-off-by: Paul Mackerras &lt;paulus@samba.org&gt;
Signed-off-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This creates new 'thread_fp_state' and 'thread_vr_state' structures
to store FP/VSX state (including FPSCR) and Altivec/VSX state
(including VSCR), and uses them in the thread_struct.  In the
thread_fp_state, the FPRs and VSRs are represented as u64 rather
than double, since we rarely perform floating-point computations
on the values, and this will enable the structures to be used
in KVM code as well.  Similarly FPSCR is now a u64 rather than
a structure of two 32-bit values.

This takes the offsets out of the macros such as SAVE_32FPRS,
REST_32FPRS, etc.  This enables the same macros to be used for normal
and transactional state, enabling us to delete the transactional
versions of the macros.   This also removes the unused do_load_up_fpu
and do_load_up_altivec, which were in fact buggy since they didn't
create large enough stack frames to account for the fact that
load_up_fpu and load_up_altivec are not designed to be called from C
and assume that their caller's stack frame is an interrupt frame.

Signed-off-by: Paul Mackerras &lt;paulus@samba.org&gt;
Signed-off-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc: Add FP/VSX and VMX register load functions for transactional memory</title>
<updated>2013-02-15T05:58:52+00:00</updated>
<author>
<name>Michael Neuling</name>
<email>mikey@neuling.org</email>
</author>
<published>2013-02-13T16:21:36+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=a2dcbb32f0647a3b8230c146a2f980654a7738af'/>
<id>a2dcbb32f0647a3b8230c146a2f980654a7738af</id>
<content type='text'>
This adds functions to restore the state of the FP/VSX registers from
what's stored in the thread_struct.  Two version for FP/VSX are required
since one restores them from transactional/checkpoint side of the
thread_struct and the other from the speculated side.

Similar functions are added for VMX registers.

Signed-off-by: Matt Evans &lt;matt@ozlabs.org&gt;
Signed-off-by: Michael Neuling &lt;mikey@neuling.org&gt;
Signed-off-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This adds functions to restore the state of the FP/VSX registers from
what's stored in the thread_struct.  Two version for FP/VSX are required
since one restores them from transactional/checkpoint side of the
thread_struct and the other from the speculated side.

Similar functions are added for VMX registers.

Signed-off-by: Matt Evans &lt;matt@ozlabs.org&gt;
Signed-off-by: Michael Neuling &lt;mikey@neuling.org&gt;
Signed-off-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc: Optimise enable_kernel_altivec</title>
<updated>2012-04-30T05:37:17+00:00</updated>
<author>
<name>Anton Blanchard</name>
<email>anton@samba.org</email>
</author>
<published>2012-04-15T20:56:45+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=35000870fcfbb28757ad47de77b4645072d916b8'/>
<id>35000870fcfbb28757ad47de77b4645072d916b8</id>
<content type='text'>
Add two optimisations to enable_kernel_altivec:

- enable_kernel_altivec has already determined if we need to
save the previous task's state but we call giveup_altivec
in both cases, requiring an extra branch in giveup_altivec. Create
giveup_altivec_notask which only turns on the VMX bit in the
MSR.

- We write the VMX MSR bit each time we call enable_kernel_altivec
even it was already set. Check the bit and branch out if we have
already set it. The classic case for this is vectored IO
where we have to copy multiple buffers to or from userspace.

The following testcase was used to confirm this patch improves
performance:

http://ozlabs.org/~anton/junkcode/copy_to_user.c

Since the current breakpoint for using VMX in copy_tofrom_user is
4096 bytes, I'm using buffers of 4096 + 1 cacheline (4224) bytes.
A benchmark of 16 entry readvs (-s 16):

time copy_to_user -l 4224 -s 16 -i 1000000

completes 5.2% faster on a POWER7 PS700.

Signed-off-by: Anton Blanchard &lt;anton@samba.org&gt;
Signed-off-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Add two optimisations to enable_kernel_altivec:

- enable_kernel_altivec has already determined if we need to
save the previous task's state but we call giveup_altivec
in both cases, requiring an extra branch in giveup_altivec. Create
giveup_altivec_notask which only turns on the VMX bit in the
MSR.

- We write the VMX MSR bit each time we call enable_kernel_altivec
even it was already set. Check the bit and branch out if we have
already set it. The classic case for this is vectored IO
where we have to copy multiple buffers to or from userspace.

The following testcase was used to confirm this patch improves
performance:

http://ozlabs.org/~anton/junkcode/copy_to_user.c

Since the current breakpoint for using VMX in copy_tofrom_user is
4096 bytes, I'm using buffers of 4096 + 1 cacheline (4224) bytes.
A benchmark of 16 entry readvs (-s 16):

time copy_to_user -l 4224 -s 16 -i 1000000

completes 5.2% faster on a POWER7 PS700.

Signed-off-by: Anton Blanchard &lt;anton@samba.org&gt;
Signed-off-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc: Remove static branch hint in giveup_altivec</title>
<updated>2011-05-19T04:30:42+00:00</updated>
<author>
<name>Anton Blanchard</name>
<email>anton@samba.org</email>
</author>
<published>2011-05-08T21:20:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=ba00ce1d6e08ad06f19f2ac53fd5c60bbe3fbeeb'/>
<id>ba00ce1d6e08ad06f19f2ac53fd5c60bbe3fbeeb</id>
<content type='text'>
A static branch hint will override dynamic branch prediction on
recent POWER CPUs. Since we are about to use more altivec in the
kernel remove the static hint in giveup_altivec that assumes
a userspace task is using altivec.

Signed-off-by: Anton Blanchard &lt;anton@samba.org&gt;
Signed-off-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
A static branch hint will override dynamic branch prediction on
recent POWER CPUs. Since we are about to use more altivec in the
kernel remove the static hint in giveup_altivec that assumes
a userspace task is using altivec.

Signed-off-by: Anton Blanchard &lt;anton@samba.org&gt;
Signed-off-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc: Remove second definition of STACK_FRAME_OVERHEAD</title>
<updated>2010-11-29T04:48:23+00:00</updated>
<author>
<name>Stephen Rothwell</name>
<email>sfr@canb.auug.org.au</email>
</author>
<published>2010-11-18T15:06:17+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=46f5221049bb46b0188aad6b6dfab5dbc778be22'/>
<id>46f5221049bb46b0188aad6b6dfab5dbc778be22</id>
<content type='text'>
Since STACK_FRAME_OVERHEAD is defined in asm/ptrace.h and that
is ASSEMBER safe, we can just include that instead of going via
asm-offsets.h.

Signed-off-by: Stephen Rothwell &lt;sfr@canb.auug.org.au&gt;
Signed-off-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Since STACK_FRAME_OVERHEAD is defined in asm/ptrace.h and that
is ASSEMBER safe, we can just include that instead of going via
asm-offsets.h.

Signed-off-by: Stephen Rothwell &lt;sfr@canb.auug.org.au&gt;
Signed-off-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc: Fix usage of 64-bit instruction in 32-bit altivec code</title>
<updated>2009-12-09T07:10:12+00:00</updated>
<author>
<name>Benjamin Herrenschmidt</name>
<email>benh@kernel.crashing.org</email>
</author>
<published>2009-12-08T18:45:45+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=e090aa80321b64c3b793f3b047e31ecf1af9538d'/>
<id>e090aa80321b64c3b793f3b047e31ecf1af9538d</id>
<content type='text'>
e821ea70f3b4873b50056a1e0f74befed1014c09 introduced a bug by copying
some 64-bit originated code as-is to be used by both 32 and 64-bit
but this code contains a 64-bit ony "cmpdi" instruction.

This changes it to cmpwi, which is fine since VRSAVE can only contains
a 32-bit value anyway.

Signed-off-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
CC: &lt;stable@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
e821ea70f3b4873b50056a1e0f74befed1014c09 introduced a bug by copying
some 64-bit originated code as-is to be used by both 32 and 64-bit
but this code contains a 64-bit ony "cmpdi" instruction.

This changes it to cmpwi, which is fine since VRSAVE can only contains
a 32-bit value anyway.

Signed-off-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
CC: &lt;stable@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc: Use names rather than numbers for SPRGs (v2)</title>
<updated>2009-08-20T00:12:27+00:00</updated>
<author>
<name>Benjamin Herrenschmidt</name>
<email>benh@kernel.crashing.org</email>
</author>
<published>2009-07-14T20:52:54+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=ee43eb788b3a06425fffb912677e2e1c8b00dd3b'/>
<id>ee43eb788b3a06425fffb912677e2e1c8b00dd3b</id>
<content type='text'>
The kernel uses SPRG registers for various purposes, typically in
low level assembly code as scratch registers or to hold per-cpu
global infos such as the PACA or the current thread_info pointer.

We want to be able to easily shuffle the usage of those registers
as some implementations have specific constraints realted to some
of them, for example, some have userspace readable aliases, etc..
and the current choice isn't always the best.

This patch should not change any code generation, and replaces the
usage of SPRN_SPRGn everywhere in the kernel with a named replacement
and adds documentation next to the definition of the names as to
what those are used for on each processor family.

The only parts that still use the original numbers are bits of KVM
or suspend/resume code that just blindly needs to save/restore all
the SPRGs.

Signed-off-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The kernel uses SPRG registers for various purposes, typically in
low level assembly code as scratch registers or to hold per-cpu
global infos such as the PACA or the current thread_info pointer.

We want to be able to easily shuffle the usage of those registers
as some implementations have specific constraints realted to some
of them, for example, some have userspace readable aliases, etc..
and the current choice isn't always the best.

This patch should not change any code generation, and replaces the
usage of SPRN_SPRGn everywhere in the kernel with a named replacement
and adds documentation next to the definition of the names as to
what those are used for on each processor family.

The only parts that still use the original numbers are bits of KVM
or suspend/resume code that just blindly needs to save/restore all
the SPRGs.

Signed-off-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc: Fix another bug in move of altivec code to vector.S</title>
<updated>2009-07-15T07:41:46+00:00</updated>
<author>
<name>Andreas Schwab</name>
<email>schwab@linux-m68k.org</email>
</author>
<published>2009-07-10T11:17:36+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=0115cb544b0a6709e9cf3de615e150d22e7d9d10'/>
<id>0115cb544b0a6709e9cf3de615e150d22e7d9d10</id>
<content type='text'>
When moving load_up_altivec to vector.S a typo in a comment caused a
thinko setting the wrong variable.

Signed-off-by: Andreas Schwab &lt;schwab@linux-m68k.org&gt;
Signed-off-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When moving load_up_altivec to vector.S a typo in a comment caused a
thinko setting the wrong variable.

Signed-off-by: Andreas Schwab &lt;schwab@linux-m68k.org&gt;
Signed-off-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
</pre>
</div>
</content>
</entry>
</feed>
