<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/arch/arm64/lib, branch v5.1</title>
<subtitle>Linux kernel source tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/'/>
<entry>
<title>arm64: string: use asm EXPORT_SYMBOL()</title>
<updated>2018-12-10T11:50:12+00:00</updated>
<author>
<name>Mark Rutland</name>
<email>mark.rutland@arm.com</email>
</author>
<published>2018-12-07T18:08:21+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=ac0e8c72b03b0e2634a355b99e1d3b780090c403'/>
<id>ac0e8c72b03b0e2634a355b99e1d3b780090c403</id>
<content type='text'>
For a while now it's been possible to use EXPORT_SYMBOL() in assembly
files, which allows us to place exports immediately after assembly
functions, as we do for C functions.

As a step towards removing arm64ksyms.c, let's move the string routine
exports to the assembly files the functions are defined in. Routines
which should only be exported for !KASAN builds are exported using the
EXPORT_SYMBOL_NOKASAN() helper.

There should be no functional change as a result of this patch.

Signed-off-by: Mark Rutland &lt;mark.rutland@arm.com&gt;
Cc: Will Deacon &lt;will.deacon@arm.com&gt;
Cc: Catalin Marinas &lt;catalin.marinas@arm.com&gt;
Signed-off-by: Will Deacon &lt;will.deacon@arm.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
For a while now it's been possible to use EXPORT_SYMBOL() in assembly
files, which allows us to place exports immediately after assembly
functions, as we do for C functions.

As a step towards removing arm64ksyms.c, let's move the string routine
exports to the assembly files the functions are defined in. Routines
which should only be exported for !KASAN builds are exported using the
EXPORT_SYMBOL_NOKASAN() helper.

There should be no functional change as a result of this patch.

Signed-off-by: Mark Rutland &lt;mark.rutland@arm.com&gt;
Cc: Will Deacon &lt;will.deacon@arm.com&gt;
Cc: Catalin Marinas &lt;catalin.marinas@arm.com&gt;
Signed-off-by: Will Deacon &lt;will.deacon@arm.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>arm64: uaccess: use asm EXPORT_SYMBOL()</title>
<updated>2018-12-10T11:50:11+00:00</updated>
<author>
<name>Mark Rutland</name>
<email>mark.rutland@arm.com</email>
</author>
<published>2018-12-07T18:08:20+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=56c08ec5162c7cf87632c59714ac625e7d54106a'/>
<id>56c08ec5162c7cf87632c59714ac625e7d54106a</id>
<content type='text'>
For a while now it's been possible to use EXPORT_SYMBOL() in assembly
files, which allows us to place exports immediately after assembly
functions, as we do for C functions.

As a step towards removing arm64ksyms.c, let's move the uaccess exports
to the assembly files the functions are defined in.  As we have to
include &lt;asm/assembler.h&gt;, the existing includes are fixed to follow the
usual ordering conventions.

There should be no functional change as a result of this patch.

Signed-off-by: Mark Rutland &lt;mark.rutland@arm.com&gt;
Cc: Will Deacon &lt;will.deacon@arm.com&gt;
Cc: Catalin Marinas &lt;catalin.marinas@arm.com&gt;
Signed-off-by: Will Deacon &lt;will.deacon@arm.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
For a while now it's been possible to use EXPORT_SYMBOL() in assembly
files, which allows us to place exports immediately after assembly
functions, as we do for C functions.

As a step towards removing arm64ksyms.c, let's move the uaccess exports
to the assembly files the functions are defined in.  As we have to
include &lt;asm/assembler.h&gt;, the existing includes are fixed to follow the
usual ordering conventions.

There should be no functional change as a result of this patch.

Signed-off-by: Mark Rutland &lt;mark.rutland@arm.com&gt;
Cc: Will Deacon &lt;will.deacon@arm.com&gt;
Cc: Catalin Marinas &lt;catalin.marinas@arm.com&gt;
Signed-off-by: Will Deacon &lt;will.deacon@arm.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>arm64: page: use asm EXPORT_SYMBOL()</title>
<updated>2018-12-10T11:50:11+00:00</updated>
<author>
<name>Mark Rutland</name>
<email>mark.rutland@arm.com</email>
</author>
<published>2018-12-07T18:08:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=50fdecb292e0b2d98b0b2bf319a4faa1aa7d666f'/>
<id>50fdecb292e0b2d98b0b2bf319a4faa1aa7d666f</id>
<content type='text'>
For a while now it's been possible to use EXPORT_SYMBOL() in assembly
files, which allows us to place exports immediately after assembly
functions, as we do for C functions.

As a step towards removing arm64ksyms.c, let's move the copy_page and
clear_page exports to the assembly files the functions are defined in.

There should be no functional change as a result of this patch.

Signed-off-by: Mark Rutland &lt;mark.rutland@arm.com&gt;
Cc: Will Deacon &lt;will.deacon@arm.com&gt;
Cc: Catalin Marinas &lt;catalin.marinas@arm.com&gt;
Signed-off-by: Will Deacon &lt;will.deacon@arm.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
For a while now it's been possible to use EXPORT_SYMBOL() in assembly
files, which allows us to place exports immediately after assembly
functions, as we do for C functions.

As a step towards removing arm64ksyms.c, let's move the copy_page and
clear_page exports to the assembly files the functions are defined in.

There should be no functional change as a result of this patch.

Signed-off-by: Mark Rutland &lt;mark.rutland@arm.com&gt;
Cc: Will Deacon &lt;will.deacon@arm.com&gt;
Cc: Catalin Marinas &lt;catalin.marinas@arm.com&gt;
Signed-off-by: Will Deacon &lt;will.deacon@arm.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>arm64: tishift: use asm EXPORT_SYMBOL()</title>
<updated>2018-12-10T11:50:11+00:00</updated>
<author>
<name>Mark Rutland</name>
<email>mark.rutland@arm.com</email>
</author>
<published>2018-12-07T18:08:17+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=abb77f3d96407cf30c2068f4a23a14b9540a1c1f'/>
<id>abb77f3d96407cf30c2068f4a23a14b9540a1c1f</id>
<content type='text'>
For a while now it's been possible to use EXPORT_SYMBOL() in assembly
files, which allows us to place exports immediately after assembly
functions, as we do for C functions.

As a step towards removing arm64ksyms.c, let's move the tishift exports
to the assembly file the functions are defined in.

There should be no functional change as a result of this patch.

Signed-off-by: Mark Rutland &lt;mark.rutland@arm.com&gt;
Cc: Will Deacon &lt;will.deacon@arm.com&gt;
Cc: Catalin Marinas &lt;catalin.marinas@arm.com&gt;
Signed-off-by: Will Deacon &lt;will.deacon@arm.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
For a while now it's been possible to use EXPORT_SYMBOL() in assembly
files, which allows us to place exports immediately after assembly
functions, as we do for C functions.

As a step towards removing arm64ksyms.c, let's move the tishift exports
to the assembly file the functions are defined in.

There should be no functional change as a result of this patch.

Signed-off-by: Mark Rutland &lt;mark.rutland@arm.com&gt;
Cc: Will Deacon &lt;will.deacon@arm.com&gt;
Cc: Catalin Marinas &lt;catalin.marinas@arm.com&gt;
Signed-off-by: Will Deacon &lt;will.deacon@arm.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>arm64: crypto: add NEON accelerated XOR implementation</title>
<updated>2018-12-06T16:47:06+00:00</updated>
<author>
<name>Jackie Liu</name>
<email>liuyun01@kylinos.cn</email>
</author>
<published>2018-12-04T01:43:23+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=cc9f8349cb33965120a96c12e05d00676162eb7f'/>
<id>cc9f8349cb33965120a96c12e05d00676162eb7f</id>
<content type='text'>
This is a NEON acceleration method that can improve
performance by approximately 20%. I got the following
data from the centos 7.5 on Huawei's HISI1616 chip:

[ 93.837726] xor: measuring software checksum speed
[ 93.874039]   8regs  : 7123.200 MB/sec
[ 93.914038]   32regs : 7180.300 MB/sec
[ 93.954043]   arm64_neon: 9856.000 MB/sec
[ 93.954047] xor: using function: arm64_neon (9856.000 MB/sec)

I believe this code can bring some optimization for
all arm64 platform. thanks for Ard Biesheuvel's suggestions.

Signed-off-by: Jackie Liu &lt;liuyun01@kylinos.cn&gt;
Reviewed-by: Ard Biesheuvel &lt;ard.biesheuvel@linaro.org&gt;
Signed-off-by: Will Deacon &lt;will.deacon@arm.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This is a NEON acceleration method that can improve
performance by approximately 20%. I got the following
data from the centos 7.5 on Huawei's HISI1616 chip:

[ 93.837726] xor: measuring software checksum speed
[ 93.874039]   8regs  : 7123.200 MB/sec
[ 93.914038]   32regs : 7180.300 MB/sec
[ 93.954043]   arm64_neon: 9856.000 MB/sec
[ 93.954047] xor: using function: arm64_neon (9856.000 MB/sec)

I believe this code can bring some optimization for
all arm64 platform. thanks for Ard Biesheuvel's suggestions.

Signed-off-by: Jackie Liu &lt;liuyun01@kylinos.cn&gt;
Reviewed-by: Ard Biesheuvel &lt;ard.biesheuvel@linaro.org&gt;
Signed-off-by: Will Deacon &lt;will.deacon@arm.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>arm64/lib: improve CRC32 performance for deep pipelines</title>
<updated>2018-11-30T13:58:04+00:00</updated>
<author>
<name>Ard Biesheuvel</name>
<email>ard.biesheuvel@linaro.org</email>
</author>
<published>2018-11-27T17:42:55+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=efdb25efc7645b326cd5eb82be5feeabe167c24e'/>
<id>efdb25efc7645b326cd5eb82be5feeabe167c24e</id>
<content type='text'>
Improve the performance of the crc32() asm routines by getting rid of
most of the branches and small sized loads on the common path.

Instead, use a branchless code path involving overlapping 16 byte
loads to process the first (length % 32) bytes, and process the
remainder using a loop that processes 32 bytes at a time.

Tested using the following test program:

  #include &lt;stdlib.h&gt;

  extern void crc32_le(unsigned short, char const*, int);

  int main(void)
  {
    static const char buf[4096];

    srand(20181126);

    for (int i = 0; i &lt; 100 * 1000 * 1000; i++)
      crc32_le(0, buf, rand() % 1024);

    return 0;
  }

On Cortex-A53 and Cortex-A57, the performance regresses but only very
slightly. On Cortex-A72 however, the performance improves from

  $ time ./crc32

  real  0m10.149s
  user  0m10.149s
  sys   0m0.000s

to

  $ time ./crc32

  real  0m7.915s
  user  0m7.915s
  sys   0m0.000s

Cc: Rui Sun &lt;sunrui26@huawei.com&gt;
Signed-off-by: Ard Biesheuvel &lt;ard.biesheuvel@linaro.org&gt;
Signed-off-by: Will Deacon &lt;will.deacon@arm.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Improve the performance of the crc32() asm routines by getting rid of
most of the branches and small sized loads on the common path.

Instead, use a branchless code path involving overlapping 16 byte
loads to process the first (length % 32) bytes, and process the
remainder using a loop that processes 32 bytes at a time.

Tested using the following test program:

  #include &lt;stdlib.h&gt;

  extern void crc32_le(unsigned short, char const*, int);

  int main(void)
  {
    static const char buf[4096];

    srand(20181126);

    for (int i = 0; i &lt; 100 * 1000 * 1000; i++)
      crc32_le(0, buf, rand() % 1024);

    return 0;
  }

On Cortex-A53 and Cortex-A57, the performance regresses but only very
slightly. On Cortex-A72 however, the performance improves from

  $ time ./crc32

  real  0m10.149s
  user  0m10.149s
  sys   0m0.000s

to

  $ time ./crc32

  real  0m7.915s
  user  0m7.915s
  sys   0m0.000s

Cc: Rui Sun &lt;sunrui26@huawei.com&gt;
Signed-off-by: Ard Biesheuvel &lt;ard.biesheuvel@linaro.org&gt;
Signed-off-by: Will Deacon &lt;will.deacon@arm.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>arm64: lib: use C string functions with KASAN enabled</title>
<updated>2018-10-26T23:25:18+00:00</updated>
<author>
<name>Andrey Ryabinin</name>
<email>aryabinin@virtuozzo.com</email>
</author>
<published>2018-10-26T22:02:30+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=19a2ca0fb560fd7be7b5293c6b652c6d6078dcde'/>
<id>19a2ca0fb560fd7be7b5293c6b652c6d6078dcde</id>
<content type='text'>
ARM64 has asm implementation of memchr(), memcmp(), str[r]chr(),
str[n]cmp(), str[n]len().  KASAN don't see memory accesses in asm code,
thus it can potentially miss many bugs.

Ifdef out __HAVE_ARCH_* defines of these functions when KASAN is enabled,
so the generic implementations from lib/string.c will be used.

We can't just remove the asm functions because efistub uses them.  And we
can't have two non-weak functions either, so declare the asm functions as
weak.

Link: http://lkml.kernel.org/r/20180920135631.23833-2-aryabinin@virtuozzo.com
Signed-off-by: Andrey Ryabinin &lt;aryabinin@virtuozzo.com&gt;
Reported-by: Kyeongdon Kim &lt;kyeongdon.kim@lge.com&gt;
Cc: Alexander Potapenko &lt;glider@google.com&gt;
Cc: Ard Biesheuvel &lt;ard.biesheuvel@linaro.org&gt;
Cc: Dmitry Vyukov &lt;dvyukov@google.com&gt;
Cc: Mark Rutland &lt;mark.rutland@arm.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
ARM64 has asm implementation of memchr(), memcmp(), str[r]chr(),
str[n]cmp(), str[n]len().  KASAN don't see memory accesses in asm code,
thus it can potentially miss many bugs.

Ifdef out __HAVE_ARCH_* defines of these functions when KASAN is enabled,
so the generic implementations from lib/string.c will be used.

We can't just remove the asm functions because efistub uses them.  And we
can't have two non-weak functions either, so declare the asm functions as
weak.

Link: http://lkml.kernel.org/r/20180920135631.23833-2-aryabinin@virtuozzo.com
Signed-off-by: Andrey Ryabinin &lt;aryabinin@virtuozzo.com&gt;
Reported-by: Kyeongdon Kim &lt;kyeongdon.kim@lge.com&gt;
Cc: Alexander Potapenko &lt;glider@google.com&gt;
Cc: Ard Biesheuvel &lt;ard.biesheuvel@linaro.org&gt;
Cc: Dmitry Vyukov &lt;dvyukov@google.com&gt;
Cc: Mark Rutland &lt;mark.rutland@arm.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>arm64: lse: remove -fcall-used-x0 flag</title>
<updated>2018-09-24T09:56:24+00:00</updated>
<author>
<name>Tri Vo</name>
<email>trong@android.com</email>
</author>
<published>2018-09-19T19:27:50+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=2a6c7c367de82951c98a290a21156770f6f82c84'/>
<id>2a6c7c367de82951c98a290a21156770f6f82c84</id>
<content type='text'>
x0 is not callee-saved in the PCS. So there is no need to specify
-fcall-used-x0.

Clang doesn't currently support -fcall-used flags. This patch will help
building the kernel with clang.

Tested-by: Nick Desaulniers &lt;ndesaulniers@google.com&gt;
Acked-by: Will Deacon &lt;will.deacon@arm.com&gt;
Signed-off-by: Tri Vo &lt;trong@android.com&gt;
Signed-off-by: Catalin Marinas &lt;catalin.marinas@arm.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
x0 is not callee-saved in the PCS. So there is no need to specify
-fcall-used-x0.

Clang doesn't currently support -fcall-used flags. This patch will help
building the kernel with clang.

Tested-by: Nick Desaulniers &lt;ndesaulniers@google.com&gt;
Acked-by: Will Deacon &lt;will.deacon@arm.com&gt;
Signed-off-by: Tri Vo &lt;trong@android.com&gt;
Signed-off-by: Catalin Marinas &lt;catalin.marinas@arm.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>arm64/lib: add accelerated crc32 routines</title>
<updated>2018-09-10T15:10:53+00:00</updated>
<author>
<name>Ard Biesheuvel</name>
<email>ard.biesheuvel@linaro.org</email>
</author>
<published>2018-08-27T11:02:44+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=7481cddf29ede204b475facc40e6f65459939881'/>
<id>7481cddf29ede204b475facc40e6f65459939881</id>
<content type='text'>
Unlike crc32c(), which is wired up to the crypto API internally so the
optimal driver is selected based on the platform's capabilities,
crc32_le() is implemented as a library function using a slice-by-8 table
based C implementation. Even though few of the call sites may be
bottlenecks, calling a time variant implementation with a non-negligible
D-cache footprint is a bit of a waste, given that ARMv8.1 and up mandates
support for the CRC32 instructions that were optional in ARMv8.0, but are
already widely available, even on the Cortex-A53 based Raspberry Pi.

So implement routines that use these instructions if available, and fall
back to the existing generic routines otherwise. The selection is based
on alternatives patching.

Note that this unconditionally selects CONFIG_CRC32 as a builtin. Since
CRC32 is relied upon by core functionality such as CONFIG_OF_FLATTREE,
this just codifies the status quo.

Signed-off-by: Ard Biesheuvel &lt;ard.biesheuvel@linaro.org&gt;
Signed-off-by: Catalin Marinas &lt;catalin.marinas@arm.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Unlike crc32c(), which is wired up to the crypto API internally so the
optimal driver is selected based on the platform's capabilities,
crc32_le() is implemented as a library function using a slice-by-8 table
based C implementation. Even though few of the call sites may be
bottlenecks, calling a time variant implementation with a non-negligible
D-cache footprint is a bit of a waste, given that ARMv8.1 and up mandates
support for the CRC32 instructions that were optional in ARMv8.0, but are
already widely available, even on the Cortex-A53 based Raspberry Pi.

So implement routines that use these instructions if available, and fall
back to the existing generic routines otherwise. The selection is based
on alternatives patching.

Note that this unconditionally selects CONFIG_CRC32 as a builtin. Since
CRC32 is relied upon by core functionality such as CONFIG_OF_FLATTREE,
this just codifies the status quo.

Signed-off-by: Ard Biesheuvel &lt;ard.biesheuvel@linaro.org&gt;
Signed-off-by: Catalin Marinas &lt;catalin.marinas@arm.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>locking/atomics/arm64: Replace our atomic/lock bitop implementations with asm-generic</title>
<updated>2018-06-21T10:52:12+00:00</updated>
<author>
<name>Will Deacon</name>
<email>will.deacon@arm.com</email>
</author>
<published>2018-06-19T12:53:13+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=7c8fc35dfc32dfa97d8a1dc25dbd064cf83936db'/>
<id>7c8fc35dfc32dfa97d8a1dc25dbd064cf83936db</id>
<content type='text'>
The &lt;asm-generic/bitops/{atomic,lock}.h&gt; implementations are built around
the atomic-fetch ops, which we implement efficiently for both LSE and
LL/SC systems. Use that instead of our hand-rolled, out-of-line bitops.S.

Signed-off-by: Will Deacon &lt;will.deacon@arm.com&gt;
Acked-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: linux-arm-kernel@lists.infradead.org
Cc: yamada.masahiro@socionext.com
Link: https://lore.kernel.org/lkml/1529412794-17720-9-git-send-email-will.deacon@arm.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The &lt;asm-generic/bitops/{atomic,lock}.h&gt; implementations are built around
the atomic-fetch ops, which we implement efficiently for both LSE and
LL/SC systems. Use that instead of our hand-rolled, out-of-line bitops.S.

Signed-off-by: Will Deacon &lt;will.deacon@arm.com&gt;
Acked-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: linux-arm-kernel@lists.infradead.org
Cc: yamada.masahiro@socionext.com
Link: https://lore.kernel.org/lkml/1529412794-17720-9-git-send-email-will.deacon@arm.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</pre>
</div>
</content>
</entry>
</feed>
