<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/crypto/sha3_generic.c, branch v4.19</title>
<subtitle>Linux kernel source tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/'/>
<entry>
<title>Merge git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux</title>
<updated>2018-08-03T09:55:12+00:00</updated>
<author>
<name>Herbert Xu</name>
<email>herbert@gondor.apana.org.au</email>
</author>
<published>2018-08-03T09:55:12+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=c5f5aeef9b55b362ad5a0e04e4b41cd63b208842'/>
<id>c5f5aeef9b55b362ad5a0e04e4b41cd63b208842</id>
<content type='text'>
Merge mainline to pick up c7513c2a2714 ("crypto/arm64: aes-ce-gcm -
add missing kernel_neon_begin/end pair").
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Merge mainline to pick up c7513c2a2714 ("crypto/arm64: aes-ce-gcm -
add missing kernel_neon_begin/end pair").
</pre>
</div>
</content>
</entry>
<entry>
<title>crypto: shash - remove useless setting of type flags</title>
<updated>2018-07-08T16:30:24+00:00</updated>
<author>
<name>Eric Biggers</name>
<email>ebiggers@google.com</email>
</author>
<published>2018-06-30T22:16:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=e50944e219f908968a6e01fbd0e8811a33bd5f04'/>
<id>e50944e219f908968a6e01fbd0e8811a33bd5f04</id>
<content type='text'>
Many shash algorithms set .cra_flags = CRYPTO_ALG_TYPE_SHASH.  But this
is redundant with the C structure type ('struct shash_alg'), and
crypto_register_shash() already sets the type flag automatically,
clearing any type flag that was already there.  Apparently the useless
assignment has just been copy+pasted around.

So, remove the useless assignment from all the shash algorithms.

This patch shouldn't change any actual behavior.

Signed-off-by: Eric Biggers &lt;ebiggers@google.com&gt;
Signed-off-by: Herbert Xu &lt;herbert@gondor.apana.org.au&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Many shash algorithms set .cra_flags = CRYPTO_ALG_TYPE_SHASH.  But this
is redundant with the C structure type ('struct shash_alg'), and
crypto_register_shash() already sets the type flag automatically,
clearing any type flag that was already there.  Apparently the useless
assignment has just been copy+pasted around.

So, remove the useless assignment from all the shash algorithms.

This patch shouldn't change any actual behavior.

Signed-off-by: Eric Biggers &lt;ebiggers@google.com&gt;
Signed-off-by: Herbert Xu &lt;herbert@gondor.apana.org.au&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>crypto: don't optimize keccakf()</title>
<updated>2018-06-15T15:06:48+00:00</updated>
<author>
<name>Dmitry Vyukov</name>
<email>dvyukov@google.com</email>
</author>
<published>2018-06-08T09:53:41+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=f044a84e040b85cd609851ac88ae8b54b2cc0b75'/>
<id>f044a84e040b85cd609851ac88ae8b54b2cc0b75</id>
<content type='text'>
keccakf() is the only function in kernel that uses __optimize() macro.
__optimize() breaks frame pointer unwinder as optimized code uses RBP,
and amusingly this always lead to degraded performance as gcc does not
inline across different optimizations levels, so keccakf() wasn't inlined
into its callers and keccakf_round() wasn't inlined into keccakf().

Drop __optimize() to resolve both problems.

Signed-off-by: Dmitry Vyukov &lt;dvyukov@google.com&gt;
Fixes: 83dee2ce1ae7 ("crypto: sha3-generic - rewrite KECCAK transform to help the compiler optimize")
Reported-by: syzbot+37035ccfa9a0a017ffcf@syzkaller.appspotmail.com
Reported-by: syzbot+e073e4740cfbb3ae200b@syzkaller.appspotmail.com
Cc: linux-crypto@vger.kernel.org
Cc: "David S. Miller" &lt;davem@davemloft.net&gt;
Cc: Herbert Xu &lt;herbert@gondor.apana.org.au&gt;
Cc: Ard Biesheuvel &lt;ard.biesheuvel@linaro.org&gt;
Acked-by: Ard Biesheuvel &lt;ard.biesheuvel@linaro.org&gt;
Signed-off-by: Herbert Xu &lt;herbert@gondor.apana.org.au&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
keccakf() is the only function in kernel that uses __optimize() macro.
__optimize() breaks frame pointer unwinder as optimized code uses RBP,
and amusingly this always lead to degraded performance as gcc does not
inline across different optimizations levels, so keccakf() wasn't inlined
into its callers and keccakf_round() wasn't inlined into keccakf().

Drop __optimize() to resolve both problems.

Signed-off-by: Dmitry Vyukov &lt;dvyukov@google.com&gt;
Fixes: 83dee2ce1ae7 ("crypto: sha3-generic - rewrite KECCAK transform to help the compiler optimize")
Reported-by: syzbot+37035ccfa9a0a017ffcf@syzkaller.appspotmail.com
Reported-by: syzbot+e073e4740cfbb3ae200b@syzkaller.appspotmail.com
Cc: linux-crypto@vger.kernel.org
Cc: "David S. Miller" &lt;davem@davemloft.net&gt;
Cc: Herbert Xu &lt;herbert@gondor.apana.org.au&gt;
Cc: Ard Biesheuvel &lt;ard.biesheuvel@linaro.org&gt;
Acked-by: Ard Biesheuvel &lt;ard.biesheuvel@linaro.org&gt;
Signed-off-by: Herbert Xu &lt;herbert@gondor.apana.org.au&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>mn10300: Remove the architecture</title>
<updated>2018-03-09T22:19:56+00:00</updated>
<author>
<name>David Howells</name>
<email>dhowells@redhat.com</email>
</author>
<published>2018-03-08T09:48:46+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=739d875dd6982618020d30f58f8acf10f6076e6d'/>
<id>739d875dd6982618020d30f58f8acf10f6076e6d</id>
<content type='text'>
Remove the MN10300 arch as the hardware is defunct.

Suggested-by: Arnd Bergmann &lt;arnd@arndb.de&gt;
Signed-off-by: David Howells &lt;dhowells@redhat.com&gt;
cc: Masahiro Yamada &lt;yamada.masahiro@socionext.com&gt;
cc: linux-am33-list@redhat.com
Signed-off-by: Arnd Bergmann &lt;arnd@arndb.de&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Remove the MN10300 arch as the hardware is defunct.

Suggested-by: Arnd Bergmann &lt;arnd@arndb.de&gt;
Signed-off-by: David Howells &lt;dhowells@redhat.com&gt;
cc: Masahiro Yamada &lt;yamada.masahiro@socionext.com&gt;
cc: linux-am33-list@redhat.com
Signed-off-by: Arnd Bergmann &lt;arnd@arndb.de&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>crypto: sha3-generic - Use __optimize to support old compilers</title>
<updated>2018-02-08T11:38:12+00:00</updated>
<author>
<name>Geert Uytterhoeven</name>
<email>geert@linux-m68k.org</email>
</author>
<published>2018-02-01T10:22:00+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=ba916b6a0339ed6cc6441ad83c097ab795dbdbc5'/>
<id>ba916b6a0339ed6cc6441ad83c097ab795dbdbc5</id>
<content type='text'>
With gcc-4.1.2:

    crypto/sha3_generic.c:39: warning: ‘__optimize__’ attribute directive ignored

Use the newly introduced __optimize macro to fix this.

Fixes: 83dee2ce1ae791c3 ("crypto: sha3-generic - rewrite KECCAK transform to help the compiler optimize")
Signed-off-by: Geert Uytterhoeven &lt;geert@linux-m68k.org&gt;
Acked-by: Ard Biesheuvel &lt;ard.biesheuvel@linaro.org&gt;
Signed-off-by: Herbert Xu &lt;herbert@gondor.apana.org.au&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
With gcc-4.1.2:

    crypto/sha3_generic.c:39: warning: ‘__optimize__’ attribute directive ignored

Use the newly introduced __optimize macro to fix this.

Fixes: 83dee2ce1ae791c3 ("crypto: sha3-generic - rewrite KECCAK transform to help the compiler optimize")
Signed-off-by: Geert Uytterhoeven &lt;geert@linux-m68k.org&gt;
Acked-by: Ard Biesheuvel &lt;ard.biesheuvel@linaro.org&gt;
Signed-off-by: Herbert Xu &lt;herbert@gondor.apana.org.au&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>crypto: sha3-generic - deal with oversize stack frames</title>
<updated>2018-02-08T11:37:08+00:00</updated>
<author>
<name>Ard Biesheuvel</name>
<email>ard.biesheuvel@linaro.org</email>
</author>
<published>2018-01-27T09:18:32+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=4767b9ad7d762876a5865a06465e13e139a01b6b'/>
<id>4767b9ad7d762876a5865a06465e13e139a01b6b</id>
<content type='text'>
As reported by kbuild test robot, the optimized SHA3 C implementation
compiles to mn10300 code that uses a disproportionate amount of stack
space, i.e.,

  crypto/sha3_generic.c: In function 'keccakf':
  crypto/sha3_generic.c:147:1: warning: the frame size of 1232 bytes is larger than 1024 bytes [-Wframe-larger-than=]

As kindly diagnosed by Arnd, this does not only occur when building for
the mn10300 architecture (which is what the report was about) but also
for h8300, and builds for other 32-bit architectures show an increase in
stack space utilization as well.

Given that SHA3 operates on 64-bit quantities, and keeps a state matrix
of 25 64-bit words, it is not surprising that 32-bit architectures with
few general purpose registers are impacted the most by this, and it is
therefore reasonable to implement a workaround that distinguishes between
32-bit and 64-bit architectures.

Arnd figured out that taking the round calculation out of the loop, and
inlining it explicitly but only on 64-bit architectures preserves most
of the performance gain achieved by the rewrite, and also gets rid of
the excessive use of stack space.

Reported-by: kbuild test robot &lt;fengguang.wu@intel.com&gt;
Suggested-by: Arnd Bergmann &lt;arnd@arndb.de&gt;
Signed-off-by: Ard Biesheuvel &lt;ard.biesheuvel@linaro.org&gt;
Signed-off-by: Herbert Xu &lt;herbert@gondor.apana.org.au&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
As reported by kbuild test robot, the optimized SHA3 C implementation
compiles to mn10300 code that uses a disproportionate amount of stack
space, i.e.,

  crypto/sha3_generic.c: In function 'keccakf':
  crypto/sha3_generic.c:147:1: warning: the frame size of 1232 bytes is larger than 1024 bytes [-Wframe-larger-than=]

As kindly diagnosed by Arnd, this does not only occur when building for
the mn10300 architecture (which is what the report was about) but also
for h8300, and builds for other 32-bit architectures show an increase in
stack space utilization as well.

Given that SHA3 operates on 64-bit quantities, and keeps a state matrix
of 25 64-bit words, it is not surprising that 32-bit architectures with
few general purpose registers are impacted the most by this, and it is
therefore reasonable to implement a workaround that distinguishes between
32-bit and 64-bit architectures.

Arnd figured out that taking the round calculation out of the loop, and
inlining it explicitly but only on 64-bit architectures preserves most
of the performance gain achieved by the rewrite, and also gets rid of
the excessive use of stack space.

Reported-by: kbuild test robot &lt;fengguang.wu@intel.com&gt;
Suggested-by: Arnd Bergmann &lt;arnd@arndb.de&gt;
Signed-off-by: Ard Biesheuvel &lt;ard.biesheuvel@linaro.org&gt;
Signed-off-by: Herbert Xu &lt;herbert@gondor.apana.org.au&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>crypto: sha3-generic - export init/update/final routines</title>
<updated>2018-01-25T14:10:34+00:00</updated>
<author>
<name>Ard Biesheuvel</name>
<email>ard.biesheuvel@linaro.org</email>
</author>
<published>2018-01-19T12:04:36+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=6657674b23b8a8458a3222ec3da2fd376c78ae79'/>
<id>6657674b23b8a8458a3222ec3da2fd376c78ae79</id>
<content type='text'>
To allow accelerated implementations to fall back to the generic
routines, e.g., in contexts where a SIMD based implementation is
not allowed to run, expose the generic SHA3 init/update/final
routines to other modules.

Signed-off-by: Ard Biesheuvel &lt;ard.biesheuvel@linaro.org&gt;
Signed-off-by: Herbert Xu &lt;herbert@gondor.apana.org.au&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
To allow accelerated implementations to fall back to the generic
routines, e.g., in contexts where a SIMD based implementation is
not allowed to run, expose the generic SHA3 init/update/final
routines to other modules.

Signed-off-by: Ard Biesheuvel &lt;ard.biesheuvel@linaro.org&gt;
Signed-off-by: Herbert Xu &lt;herbert@gondor.apana.org.au&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>crypto: sha3-generic - simplify code</title>
<updated>2018-01-25T14:10:33+00:00</updated>
<author>
<name>Ard Biesheuvel</name>
<email>ard.biesheuvel@linaro.org</email>
</author>
<published>2018-01-19T12:04:35+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=beeb504adf3d08c0e916f43259e8e2ad6bdd30ee'/>
<id>beeb504adf3d08c0e916f43259e8e2ad6bdd30ee</id>
<content type='text'>
In preparation of exposing the generic SHA3 implementation to other
versions as a fallback, simplify the code, and remove an inconsistency
in the output handling (endian swabbing rsizw words of state before
writing the output does not make sense)

Signed-off-by: Ard Biesheuvel &lt;ard.biesheuvel@linaro.org&gt;
Signed-off-by: Herbert Xu &lt;herbert@gondor.apana.org.au&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
In preparation of exposing the generic SHA3 implementation to other
versions as a fallback, simplify the code, and remove an inconsistency
in the output handling (endian swabbing rsizw words of state before
writing the output does not make sense)

Signed-off-by: Ard Biesheuvel &lt;ard.biesheuvel@linaro.org&gt;
Signed-off-by: Herbert Xu &lt;herbert@gondor.apana.org.au&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>crypto: sha3-generic - rewrite KECCAK transform to help the compiler optimize</title>
<updated>2018-01-25T14:10:33+00:00</updated>
<author>
<name>Ard Biesheuvel</name>
<email>ard.biesheuvel@linaro.org</email>
</author>
<published>2018-01-19T12:04:34+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=83dee2ce1ae791c3dc0c9d4d3a8d42cb109613f6'/>
<id>83dee2ce1ae791c3dc0c9d4d3a8d42cb109613f6</id>
<content type='text'>
The way the KECCAK transform is currently coded involves many references
into the state array using indexes that are calculated at runtime using
simple but non-trivial arithmetic. This forces the compiler to treat the
state matrix as an array in memory rather than keep it in registers,
which results in poor performance.

So instead, let's rephrase the algorithm using fixed array indexes only.
This helps the compiler keep the state matrix in registers, resulting
in the following speedup (SHA3-256 performance in cycles per byte):

                                            before   after   speedup
  Intel Core i7 @ 2.0 GHz (2.9 turbo)        100.6    35.7     2.8x
  Cortex-A57 @ 2.0 GHz (64-bit mode)         101.6    12.7     8.0x
  Cortex-A53 @ 1.0 GHz                       224.4    15.8    14.2x
  Cortex-A57 @ 2.0 GHz (32-bit mode)         201.8    63.0     3.2x

Signed-off-by: Ard Biesheuvel &lt;ard.biesheuvel@linaro.org&gt;
Signed-off-by: Herbert Xu &lt;herbert@gondor.apana.org.au&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The way the KECCAK transform is currently coded involves many references
into the state array using indexes that are calculated at runtime using
simple but non-trivial arithmetic. This forces the compiler to treat the
state matrix as an array in memory rather than keep it in registers,
which results in poor performance.

So instead, let's rephrase the algorithm using fixed array indexes only.
This helps the compiler keep the state matrix in registers, resulting
in the following speedup (SHA3-256 performance in cycles per byte):

                                            before   after   speedup
  Intel Core i7 @ 2.0 GHz (2.9 turbo)        100.6    35.7     2.8x
  Cortex-A57 @ 2.0 GHz (64-bit mode)         101.6    12.7     8.0x
  Cortex-A53 @ 1.0 GHz                       224.4    15.8    14.2x
  Cortex-A57 @ 2.0 GHz (32-bit mode)         201.8    63.0     3.2x

Signed-off-by: Ard Biesheuvel &lt;ard.biesheuvel@linaro.org&gt;
Signed-off-by: Herbert Xu &lt;herbert@gondor.apana.org.au&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>crypto: sha3-generic - fixes for alignment and big endian operation</title>
<updated>2018-01-25T14:10:32+00:00</updated>
<author>
<name>Ard Biesheuvel</name>
<email>ard.biesheuvel@linaro.org</email>
</author>
<published>2018-01-19T12:04:33+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=c013cee99d5a18aec8c71fee8f5f41369cd12595'/>
<id>c013cee99d5a18aec8c71fee8f5f41369cd12595</id>
<content type='text'>
Ensure that the input is byte swabbed before injecting it into the
SHA3 transform. Use the get_unaligned() accessor for this so that
we don't perform unaligned access inadvertently on architectures
that do not support that.

Cc: &lt;stable@vger.kernel.org&gt;
Fixes: 53964b9ee63b7075 ("crypto: sha3 - Add SHA-3 hash algorithm")
Signed-off-by: Ard Biesheuvel &lt;ard.biesheuvel@linaro.org&gt;
Signed-off-by: Herbert Xu &lt;herbert@gondor.apana.org.au&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Ensure that the input is byte swabbed before injecting it into the
SHA3 transform. Use the get_unaligned() accessor for this so that
we don't perform unaligned access inadvertently on architectures
that do not support that.

Cc: &lt;stable@vger.kernel.org&gt;
Fixes: 53964b9ee63b7075 ("crypto: sha3 - Add SHA-3 hash algorithm")
Signed-off-by: Ard Biesheuvel &lt;ard.biesheuvel@linaro.org&gt;
Signed-off-by: Herbert Xu &lt;herbert@gondor.apana.org.au&gt;
</pre>
</div>
</content>
</entry>
</feed>
