<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/arch/powerpc/lib, branch v4.19</title>
<subtitle>Linux kernel source tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/'/>
<id>https://git.tavy.me/linux.git/</id>
<updated>2018-10-02T13:34:14+00:00</updated>
<entry>
<title>powerpc/lib: fix book3s/32 boot failure due to code patching</title>
<updated>2018-10-02T13:34:14+00:00</updated>
<author>
<name>Christophe Leroy</name>
<email>christophe.leroy@c-s.fr</email>
</author>
<published>2018-10-01T12:21:10+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=b45ba4a51cde29b2939365ef0c07ad34c8321789'/>
<id>b45ba4a51cde29b2939365ef0c07ad34c8321789</id>
<content type='text'>
Commit 51c3c62b58b3 ("powerpc: Avoid code patching freed init
sections") accesses the 'init_mem_is_free' flag too early, before the
kernel is relocated. This causes an early boot failure (before the
console is active).

As it is not necessary to do this verification that early, move the
test from __patch_instruction() into patch_instruction().

This modification also has the advantage of avoiding unnecessary
remappings.
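
The resulting logic looks roughly like this (a sketch based on the
description above, not the exact diff; do_patch_instruction() stands
in for the low-level remap-and-store path):

  int patch_instruction(unsigned int *addr, unsigned int instr)
  {
  	/* Test moved out of __patch_instruction(): skip freed init
  	 * text up front, before any temporary mapping is set up */
  	if (init_mem_is_free &amp;&amp; init_section_contains(addr, 4))
  		return 0;

  	return do_patch_instruction(addr, instr);
  }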

Fixes: 51c3c62b58b3 ("powerpc: Avoid code patching freed init sections")
Cc: stable@vger.kernel.org # 4.13+
Signed-off-by: Christophe Leroy &lt;christophe.leroy@c-s.fr&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
</content>
</entry>
<entry>
<title>powerpc: fix csum_ipv6_magic() on little endian platforms</title>
<updated>2018-09-20T11:12:28+00:00</updated>
<author>
<name>Christophe Leroy</name>
<email>christophe.leroy@c-s.fr</email>
</author>
<published>2018-09-10T06:09:04+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=85682a7e3b9c664995ad477520f917039afdc330'/>
<id>85682a7e3b9c664995ad477520f917039afdc330</id>
<content type='text'>
On little endian platforms, csum_ipv6_magic() keeps len and proto in
CPU byte order. This generates a bad result, leading to ICMPv6 packets
from other hosts being dropped by powerpc64le platforms.

In order to fix this, len and proto should be converted to network
byte order, i.e. big endian byte order. However, checksumming
0x12345678 and 0x56341278 provides the exact same result, so it is
enough to rotate the sum of len and proto by 1 byte.
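
The equivalence is easy to check; the following stand-alone program
(ours, purely illustrative) folds both constants with the usual
16-bit one's complement sum:

  #include &lt;stdio.h&gt;
  #include &lt;stdint.h&gt;

  /* 16-bit one's complement sum of a 32-bit word: add the two
   * halves, then fold the carry back in */
  static uint16_t csum16(uint32_t w)
  {
  	uint32_t s = (w &gt;&gt; 16) + (w &amp; 0xffff);
  	return (s &gt;&gt; 16) + (s &amp; 0xffff);
  }

  int main(void)
  {
  	/* both words hold the same bytes in 256-weighted and
  	 * 1-weighted positions, so the sums match: 68ac 68ac */
  	printf("%04x %04x\n", csum16(0x12345678), csum16(0x56341278));
  	return 0;
  }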

PPC32 only supports big endian, so the fix is needed for PPC64 only.

Fixes: e9c4943a107b ("powerpc: Implement csum_ipv6_magic in assembly")
Reported-by: Jianlin Shi &lt;jishi@redhat.com&gt;
Reported-by: Xin Long &lt;lucien.xin@gmail.com&gt;
Cc: &lt;stable@vger.kernel.org&gt; # 4.18+
Signed-off-by: Christophe Leroy &lt;christophe.leroy@c-s.fr&gt;
Tested-by: Xin Long &lt;lucien.xin@gmail.com&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
</content>
</entry>
<entry>
<title>powerpc: Avoid code patching freed init sections</title>
<updated>2018-09-18T12:42:54+00:00</updated>
<author>
<name>Michael Neuling</name>
<email>mikey@neuling.org</email>
</author>
<published>2018-09-14T01:14:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=51c3c62b58b357e8d35e4cc32f7b4ec907426fe3'/>
<id>51c3c62b58b357e8d35e4cc32f7b4ec907426fe3</id>
<content type='text'>
This stops us from doing code patching in init sections after they've
been freed.

In this chain:
  kvm_guest_init() -&gt;
    kvm_use_magic_page() -&gt;
      fault_in_pages_readable() -&gt;
	 __get_user() -&gt;
	   __get_user_nocheck() -&gt;
	     barrier_nospec();

We have a code patching location at barrier_nospec() and
kvm_guest_init() is an init function. This whole chain gets inlined,
so when we free the init section (hence kvm_guest_init()), this code
goes away and hence should no longer be patched.
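
Concretely, this implies a flag that is set when the init memory is
freed and checked before patching; a rough sketch (not the exact
diff; free_init_pages() is a stand-in for the real free routine):

  bool init_mem_is_free;

  void free_initmem(void)
  {
  	free_init_pages();        /* assumed helper: releases __init memory */
  	init_mem_is_free = true;  /* from now on, refuse to patch it */
  }

  int __patch_instruction(unsigned int *addr, unsigned int instr)
  {
  	if (init_mem_is_free &amp;&amp; init_section_contains(addr, 4))
  		return 0;	/* the code is gone; nothing to patch */
  	/* ... perform the store and flush the icache ... */
  	return 0;
  }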

We have seen this as userspace memory corruption when using a memory
checker while doing partition migration testing on PowerVM (this
starts the code patching post migration via
/sys/kernel/mobility/migration). In theory, it could also happen when
using /sys/kernel/debug/powerpc/barrier_nospec.

Cc: stable@vger.kernel.org # 4.13+
Signed-off-by: Michael Neuling &lt;mikey@neuling.org&gt;
Reviewed-by: Nicholas Piggin &lt;npiggin@gmail.com&gt;
Reviewed-by: Christophe Leroy &lt;christophe.leroy@c-s.fr&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
</content>
</entry>
<entry>
<title>powerpc/lib: Use patch_site to patch copy_32 functions once cache is enabled</title>
<updated>2018-08-10T12:12:35+00:00</updated>
<author>
<name>Christophe Leroy</name>
<email>christophe.leroy@c-s.fr</email>
</author>
<published>2018-08-09T08:14:41+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=fa54a981ea7a852c145b05c95abba11e81fc1157'/>
<id>fa54a981ea7a852c145b05c95abba11e81fc1157</id>
<content type='text'>
The symbol memcpy_nocache_branch, defined in order to allow patching
of the memset function once the cache is enabled, leads to confusing
reports by the perf tool.

Using the new patch_site functionality solves this issue.

Signed-off-by: Christophe Leroy &lt;christophe.leroy@c-s.fr&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
</content>
</entry>
<entry>
<title>powerpc/64: Copy as much as possible in __copy_tofrom_user</title>
<updated>2018-08-07T14:32:36+00:00</updated>
<author>
<name>Paul Mackerras</name>
<email>paulus@ozlabs.org</email>
</author>
<published>2018-08-03T10:13:06+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=f8db2007ff5838aff696bd4297eefcc77af2cf46'/>
<id>f8db2007ff5838aff696bd4297eefcc77af2cf46</id>
<content type='text'>
In __copy_tofrom_user, if we encounter an exception on a store, we
stop copying and return the number of bytes not copied.  However,
if the store is wider than one byte and is to an unaligned address,
it is possible that the store operand overlaps a page boundary
and the exception occurred on the latter part of the store operand,
meaning that it would be possible to copy a few more bytes.  Since
copy_to_user is generally expected to copy as much as possible,
it would be better to copy those extra few bytes.  This adds code
to do that.  Since this edge case is not performance-critical,
the code has been written to be compact rather than as fast as
possible.
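
In C terms, the new fixup behaves roughly like this byte-by-byte
retry (illustrative only; the real code lives in the assembler
exception path):

  /* After a wide store has faulted, crawl forward one byte at a
   * time: the leading bytes of the store operand may still be
   * writable when the fault was beyond a page boundary. */
  static unsigned long copy_tail(char *to, const char *from, unsigned long n)
  {
  	while (n) {
  		if (__put_user(*from, to))
  			break;		/* first genuinely bad byte */
  		to++; from++; n--;
  	}
  	return n;			/* bytes not copied */
  }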

Signed-off-by: Paul Mackerras &lt;paulus@ozlabs.org&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
</content>
</entry>
<entry>
<title>selftests/powerpc/64: Test all paths through copy routines</title>
<updated>2018-08-07T14:32:35+00:00</updated>
<author>
<name>Paul Mackerras</name>
<email>paulus@ozlabs.org</email>
</author>
<published>2018-08-03T10:13:04+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=98c45f51f767bfdd71d773cceaceb403352e51ae'/>
<id>98c45f51f767bfdd71d773cceaceb403352e51ae</id>
<content type='text'>
The hand-coded assembler 64-bit copy routines include feature sections
that select one code path or another depending on which CPU we are
executing on.  The self-tests for these copy routines end up testing
just one path.  This adds a mechanism for selecting any desired code
path at compile time, and makes 2 or 3 versions of each test, each
using a different code path, so as to cover all the possible paths.
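
Schematically, each test binary is built from the same source with a
macro pinning the path; a toy illustration of the idea (ours, not the
selftest code itself):

  #include &lt;stdio.h&gt;

  #ifndef SELFTEST_CASE
  #define SELFTEST_CASE 0	/* overridden per build: -DSELFTEST_CASE=1 */
  #endif

  int main(void)
  {
  /* stand-in for the copy routine: one branch per feature path */
  #if SELFTEST_CASE == 0
  	puts("testing the default path");
  #else
  	puts("testing an alternative feature path");
  #endif
  	return 0;
  }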

Signed-off-by: Paul Mackerras &lt;paulus@ozlabs.org&gt;
[mpe: Add -mcpu=power4 to CFLAGS for older compilers]
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
</content>
</entry>
<entry>
<title>powerpc/64: Make exception table clearer in __copy_tofrom_user_base</title>
<updated>2018-08-07T14:32:34+00:00</updated>
<author>
<name>Paul Mackerras</name>
<email>paulus@ozlabs.org</email>
</author>
<published>2018-08-03T10:13:03+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=a7c81ce398e2ad304f61d6167155f3ef65a96524'/>
<id>a7c81ce398e2ad304f61d6167155f3ef65a96524</id>
<content type='text'>
This aims to make the generation of exception table entries for the
loads and stores in __copy_tofrom_user_base clearer and easier to
verify.  Instead of having a series of local labels on the loads and
stores, with a series of corresponding labels later for the exception
handlers, we now use macros to generate exception table entries at the
point of each load and store that could potentially trap.  We do this
with the macros lex (load exception) and stex (store exception).
These macros are used right before the load or store to which they
apply.

Some complexity is introduced by the fact that we have some more work
to do after hitting an exception, because we need to calculate and
return the number of bytes not copied.  The code uses r3 as the
current pointer into the destination buffer, that is, the address of
the first byte of the destination that has not been modified.
However, at various points in the copy loops, r3 can be 4, 8, 16 or 24
bytes behind that point.

To express this offset in an understandable way, we define a symbol
r3_offset which is updated at various points so that it is equal to the
difference between the address of the first unmodified byte of the
destination and the value in r3.  (In fact it only needs to be
accurate at the point of each lex or stex macro invocation.)

The rules for updating r3_offset are as follows:

* It starts out at 0
* An addi r3,r3,N instruction decreases r3_offset by N
* A store instruction (stb, sth, stw, std) to N(r3)
  increases r3_offset by the width of the store (1, 2, 4, 8)
* A store with update instruction (stbu, sthu, stwu, stdu) to N(r3)
  sets r3_offset to the width of the store.
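
To see the rules in action, here is a toy user-space trace (ours,
purely illustrative, nothing to do with the real assembler) of a
typical unrolled store/advance sequence:

  #include &lt;stdio.h&gt;

  static long r3, r3_offset;

  static void show(const char *insn)
  {
  	printf("%-16s r3=%ld  r3_offset=%ld\n", insn, r3, r3_offset);
  }

  int main(void)
  {
  	r3_offset += 8; show("std r0,0(r3)");	/* store adds its width */
  	r3_offset += 8; show("std r0,8(r3)");
  	r3 += 16; r3_offset -= 16; show("addi r3,r3,16"); /* addi subtracts N */
  	r3 += 8; r3_offset = 8; show("stdu r0,8(r3)");	/* update form resets */
  	return 0;
  }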

There is some trickiness to the way that the lex and stex macros and
the associated exception handlers work.  I would have liked to use
the current value of r3_offset in the name of the symbol used as
the exception handler, as in ".Lld_exc_$(r3_offset)" and then
have symbols .Lld_exc_0, .Lld_exc_8, .Lld_exc_16 etc. corresponding
to the offsets that needed to be added to r3.  However, I couldn't
see a way to do that with gas.

Instead, the exception handler address is .Lld_exc - r3_offset or
.Lst_exc - r3_offset, that is, the distance ahead of .Lld_exc/.Lst_exc
that we start executing is equal to the amount that we need to add to
r3.  This works because r3_offset is always a small multiple of 4,
and our instructions are 4 bytes long.  This means that before
.Lld_exc and .Lst_exc, we have a sequence of instructions that
increments r3 by 4, 8, 16 or 24 depending on where we start.  The
sequence increments r3 by 4 per instruction (on average).

We also replace the exception table for the 4k copy loop by a
macro per load or store.  These loads and stores all use exactly
the same exception handler, which simply resets the argument registers
r3, r4 and r5 to their original values and redoes the whole copy
using the slower loop.

Signed-off-by: Paul Mackerras &lt;paulus@ozlabs.org&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
</content>
</entry>
<entry>
<title>powerpc/asm: Add a patch_site macro &amp; helpers for patching instructions</title>
<updated>2018-08-07T14:32:25+00:00</updated>
<author>
<name>Michael Ellerman</name>
<email>mpe@ellerman.id.au</email>
</author>
<published>2018-07-23T15:07:52+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=06d0bbc6d0f56dacac3a79900e9a9a0d5972d818'/>
<id>06d0bbc6d0f56dacac3a79900e9a9a0d5972d818</id>
<content type='text'>
Add a macro and some helper C functions for patching single asm
instructions.

The gas macro means we can do something like:

  1:	nop
  	patch_site 1b, patch__foo

This is less visually distracting than defining a GLOBAL symbol at 1,
and also doesn't pollute the symbol table, which can confuse e.g. perf.

These are obviously similar to our existing feature sections, but are
not automatically patched based on CPU/MMU features, rather they are
designed to be manually patched by C code at some arbitrary point.
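
On the C side, patching a site then looks something like this sketch
(patch_instruction_site() follows our reading of the new helpers;
treat the exact names as assumptions):

  #include &lt;asm/code-patching.h&gt;
  #include &lt;asm/ppc-opcode.h&gt;

  /* emitted by the patch_site macro in the asm file */
  extern s32 patch__foo;

  static void disable_foo(void)
  {
  	/* patch_site_addr() turns the stored offset back into the
  	 * address of the 1: label, then a nop is written there */
  	patch_instruction_site(&amp;patch__foo, PPC_INST_NOP);
  }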

Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
</content>
</entry>
<entry>
<title>powerpc/fsl: Add barrier_nospec implementation for NXP PowerPC Book3E</title>
<updated>2018-08-07T14:32:24+00:00</updated>
<author>
<name>Diana Craciun</name>
<email>diana.craciun@nxp.com</email>
</author>
<published>2018-07-27T23:06:37+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=ebcd1bfc33c7a90df941df68a6e5d4018c022fba'/>
<id>ebcd1bfc33c7a90df941df68a6e5d4018c022fba</id>
<content type='text'>
Implement barrier_nospec as an isync;sync instruction sequence.
The implementation uses the infrastructure built for Book3S 64.
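
Semantically, the patched-in sequence behaves like the following
barrier (a sketch only; the real implementation goes through the
fixup-section machinery so it can be patched at boot):

  /* isync prevents subsequent instructions from starting until
   * preceding ones have completed; sync orders storage accesses */
  #define barrier_nospec() asm volatile("isync; sync" ::: "memory")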

Signed-off-by: Diana Craciun &lt;diana.craciun@nxp.com&gt;
[mpe: Split out of larger patch]
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
</content>
</entry>
<entry>
<title>powerpc/64: Add CONFIG_PPC_BARRIER_NOSPEC</title>
<updated>2018-08-07T14:32:23+00:00</updated>
<author>
<name>Michael Ellerman</name>
<email>mpe@ellerman.id.au</email>
</author>
<published>2018-07-27T23:06:34+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=179ab1cbf883575c3a585bcfc0f2160f1d22a149'/>
<id>179ab1cbf883575c3a585bcfc0f2160f1d22a149</id>
<content type='text'>
Add a config symbol to encode which platforms support the
barrier_nospec speculation barrier. Currently this is just Book3S 64
but we will add Book3E in a future patch.

Signed-off-by: Diana Craciun &lt;diana.craciun@nxp.com&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
</content>
</entry>
</feed>
