| Age | Commit message (Collapse) | Author |
|
Consolidate the ChaCha code into a single module (excluding
chacha-block-generic.c which remains always built-in for random.c),
similar to various other algorithms:
- Each arch now provides a header file lib/crypto/$(SRCARCH)/chacha.h,
replacing lib/crypto/$(SRCARCH)/chacha*.c. The header defines
chacha_crypt_arch() and hchacha_block_arch(). It is included by
lib/crypto/chacha.c, and thus the code gets built into the single
libchacha module, with improved inlining in some cases.
- Whether arch-optimized ChaCha is buildable is now controlled centrally
by lib/crypto/Kconfig instead of by lib/crypto/$(SRCARCH)/Kconfig.
The conditions for enabling it remain the same as before, and it
remains enabled by default.
- Any additional arch-specific translation units for the optimized
ChaCha code, such as assembly files, are now compiled by
lib/crypto/Makefile instead of lib/crypto/$(SRCARCH)/Makefile.
This removes the last use for the Makefile and Kconfig files in the
arm64, mips, powerpc, riscv, and s390 subdirectories of lib/crypto/. So
also remove those files and the references to them.
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250827151131.27733-7-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|
|
Rename libchacha.c to chacha.c to make the naming consistent with other
algorithms and allow additional source files to be added to the
libchacha module. This file currently contains chacha_crypt_generic(),
but it will soon be updated to contain chacha_crypt().
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250827151131.27733-6-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|
|
Rename chacha.c to chacha-block-generic.c to free up the name chacha.c
for the high-level API entry points (chacha_crypt() and
hchacha_block()), similar to the other algorithms.
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250827151131.27733-5-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|
|
chacha_is_arch_optimized() is no longer used, so remove it.
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250827151131.27733-4-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|
|
This is a straight import of the OpenSSL/CRYPTOGAMS Poly1305
implementation for riscv authored by Andy Polyakov. The file
'poly1305-riscv.pl' is taken straight from
https://github.com/dot-asm/cryptogams commit
5e3fba73576244708a752fa61a8e93e587f271bb. This patch was tested on
SpacemiT X60, with 2~2.5x improvement over generic implementation.
Signed-off-by: Chunyan Zhang <zhangchunyan@iscas.ac.cn>
Signed-off-by: Zhihang Shao <zhihang.shao.iscas@gmail.com>
[EB: ported to lib/crypto/riscv/]
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250829152513.92459-4-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|
|
Consolidate the Poly1305 code into a single module, similar to various
other algorithms (SHA-1, SHA-256, SHA-512, etc.):
- Each arch now provides a header file lib/crypto/$(SRCARCH)/poly1305.h,
replacing lib/crypto/$(SRCARCH)/poly1305*.c. The header defines
poly1305_block_init(), poly1305_blocks(), poly1305_emit(), and
optionally poly1305_mod_init_arch(). It is included by
lib/crypto/poly1305.c, and thus the code gets built into the single
libpoly1305 module, with improved inlining in some cases.
- Whether arch-optimized Poly1305 is buildable is now controlled
centrally by lib/crypto/Kconfig instead of by
lib/crypto/$(SRCARCH)/Kconfig. The conditions for enabling it remain
the same as before, and it remains enabled by default. (The PPC64 one
remains unconditionally disabled due to 'depends on BROKEN'.)
- Any additional arch-specific translation units for the optimized
Poly1305 code, such as assembly files, are now compiled by
lib/crypto/Makefile instead of lib/crypto/$(SRCARCH)/Makefile.
A special consideration is needed because the Adiantum code uses the
poly1305_core_*() functions directly. For now, just carry forward that
approach. This means retaining the CRYPTO_LIB_POLY1305_GENERIC kconfig
symbol, and keeping the poly1305_core_*() functions in separate
translation units. So it's not quite as streamlined I've done with the
other hash functions, but we still get a single libpoly1305 module.
Note: to see the diff from the arm, arm64, and x86 .c files to the new
.h files, view this commit with 'git show -M10'.
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250829152513.92459-3-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|
|
poly1305_is_arch_optimized() is unused, so remove it.
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250829152513.92459-2-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|
|
Drop 'inline' from all the *_mod_init_arch() functions so that the
compiler will warn about any bugs where they are unused due to not being
wired up properly. (There are no such bugs currently, so this just
establishes a more robust convention for the future. Of course, these
functions also tend to get inlined anyway, regardless of the keyword.)
Link: https://lore.kernel.org/r/20250816020457.432040-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|
|
Add a KUnit test suite for the MD5 library functions, including the
corresponding HMAC support. The core test logic is in the
previously-added hash-test-template.h. This commit just adds the actual
KUnit suite, and it adds the generated test vectors to the tree so that
gen-hash-testvecs.py won't have to be run at build time.
Link: https://lore.kernel.org/r/20250805222855.10362-8-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|
|
Instead of exposing the sparc-optimized MD5 code via sparc-specific
crypto_shash algorithms, instead just implement the md5_blocks() library
function. This is much simpler, it makes the MD5 library functions be
sparc-optimized, and it fixes the longstanding issue where the
sparc-optimized MD5 code was disabled by default. MD5 still remains
available through crypto_shash, but individual architectures no longer
need to handle it.
Note: to see the diff from arch/sparc/crypto/md5_glue.c to
lib/crypto/sparc/md5.h, view this commit with 'git show -M10'.
Link: https://lore.kernel.org/r/20250805222855.10362-6-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|
|
Instead of exposing the powerpc-optimized MD5 code via powerpc-specific
crypto_shash algorithms, instead just implement the md5_blocks() library
function. This is much simpler, it makes the MD5 library functions be
powerpc-optimized, and it fixes the longstanding issue where the
powerpc-optimized MD5 code was disabled by default. MD5 still remains
available through crypto_shash, but individual architectures no longer
need to handle it.
Link: https://lore.kernel.org/r/20250805222855.10362-5-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|
|
Instead of exposing the mips-optimized MD5 code via mips-specific
crypto_shash algorithms, instead just implement the md5_blocks() library
function. This is much simpler, it makes the MD5 library functions be
mips-optimized, and it fixes the longstanding issue where the
mips-optimized MD5 code was disabled by default. MD5 still remains
available through crypto_shash, but individual architectures no longer
need to handle it.
Note: to see the diff from arch/mips/cavium-octeon/crypto/octeon-md5.c
to lib/crypto/mips/md5.h, view this commit with 'git show -M10'.
Link: https://lore.kernel.org/r/20250805222855.10362-3-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|
|
Add library functions for MD5, including HMAC support. The MD5
implementation is derived from crypto/md5.c. This closely mirrors the
corresponding SHA-1 and SHA-2 changes.
Like SHA-1 and SHA-2, support for architecture-optimized MD5
implementations is included. I originally proposed dropping those, but
unfortunately there is an AF_ALG user of the PowerPC MD5 code
(https://lore.kernel.org/r/c4191597-341d-4fd7-bc3d-13daf7666c41@csgroup.eu/),
and dropping that code would be viewed as a performance regression. We
don't add new software algorithm implementations purely for AF_ALG, as
escalating to kernel mode merely to do calculations that could be done
in userspace is inefficient and is completely the wrong design. But
since this one already existed, it gets grandfathered in for now. An
objection was also raised to dropping the SPARC64 MD5 code because it
utilizes the CPU's direct support for MD5, although it remains unclear
that anyone is using that. Regardless, we'll keep these around for now.
Note that while MD5 is a legacy algorithm that is vulnerable to
practical collision attacks, it still has various in-kernel users that
implement legacy protocols. Switching to a simple library API, which is
the way the code should have been organized originally, will greatly
simplify their code. For example:
MD5:
drivers/md/dm-crypt.c (for lmk IV generation)
fs/nfsd/nfs4recover.c
fs/ecryptfs/
fs/smb/client/
net/{ipv4,ipv6}/ (for TCP-MD5 signatures)
HMAC-MD5:
fs/smb/client/
fs/smb/server/
(Also net/sctp/ if it continues using HMAC-MD5 for cookie generation.
However, that use case has the flexibility to upgrade to a more modern
algorithm, which I'll be proposing instead.)
As usual, the "md5" and "hmac(md5)" crypto_shash algorithms will also be
reimplemented on top of these library functions. For "hmac(md5)" this
will provide a faster, more streamlined implementation.
Link: https://lore.kernel.org/r/20250805222855.10362-2-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|
|
Since sha512_kunit tests the fallback code paths without using
crypto_simd_disabled_for_test, make the SHA-512 code just use the
underlying may_use_simd() and irq_fpu_usable() functions directly
instead of crypto_simd_usable(). This eliminates an unnecessary layer.
Link: https://lore.kernel.org/r/20250731223651.136939-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|
|
Since sha256_kunit tests the fallback code paths without using
crypto_simd_disabled_for_test, make the SHA-256 code just use the
underlying may_use_simd() and irq_fpu_usable() functions directly
instead of crypto_simd_usable(). This eliminates an unnecessary layer.
While doing this, also add likely() annotations, and fix a minor
inconsistency where the static keys in the sha256.h files were in a
different place than in the corresponding sha1.h and sha512.h files.
Link: https://lore.kernel.org/r/20250731223510.136650-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|
|
make clean does not check the kernel config when removing files. As
such, additions to clean-files under CONFIG_ARM or CONFIG_ARM64 are not
evaluated. For example, when building on arm64, this means that
lib/crypto/arm64/sha{256,512}-core.S are left over after make clean.
Set clean-files unconditionally to ensure that make clean removes these
files.
Fixes: e96cb9507f2d ("lib/crypto: sha256: Consolidate into single module")
Fixes: 24c91b62ac50 ("lib/crypto: arm/sha512: Migrate optimized SHA-512 code to library")
Fixes: 60e3f1e9b7a5 ("lib/crypto: arm64/sha512: Migrate optimized SHA-512 code to library")
Signed-off-by: Tal Zussman <tz2294@columbia.edu>
Link: https://lore.kernel.org/r/20250814-crypto_clean-v2-1-659a2dc86302@columbia.edu
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|
|
Update the help text for CRYPTO_LIB_SHA1 and CRYPTO_LIB_SHA256 to
reflect the addition of HMAC support, and to be consistent with
CRYPTO_LIB_SHA512.
Link: https://lore.kernel.org/r/20250731224218.137947-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|
|
Rename run_irq_test() to kunit_run_irq_test() and move it to a public
header so that it can be reused by crc_kunit.
Link: https://lore.kernel.org/r/20250811182631.376302-2-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 updates from Alexander Gordeev:
- Standardize on the __ASSEMBLER__ macro that is provided by GCC and
Clang compilers and replace __ASSEMBLY__ with __ASSEMBLER__ in both
uapi and non-uapi headers
- Explicitly include <linux/export.h> in architecture and driver files
which contain an EXPORT_SYMBOL() and remove the include from the
files which do not contain the EXPORT_SYMBOL()
- Use the full title of "z/Architecture Principles of Operation" manual
and the name of a section where facility bits are listed
- Use -D__DISABLE_EXPORTS for files in arch/s390/boot to avoid
unnecessary slowing down of the build and confusing external kABI
tools that process symtypes data
- Print additional unrecoverable machine check information to make the
root cause analysis easier
- Move cmpxchg_user_key() handling to uaccess library code, since the
generated code is large anyway and there is no benefit if it is
inlined
- Fix a problem when cmpxchg_user_key() is executing a code with a
non-default key: if a system is IPL-ed with "LOAD NORMAL", and the
previous system used storage keys where the fetch-protection bit was
set for some pages, and the cmpxchg_user_key() is located within such
page, a protection exception happens
- Either the external call or emergency signal order is used to send an
IPI to a remote CPU. Use the external order only, since it is at
least as good and sometimes even better, than the emergency signal
- In case of an early crash the early program check handler prints more
or less random value of the last breaking event address, since it is
not initialized properly. Copy the last breaking event address from
the lowcore to pt_regs to address this
- During STP synchronization check udelay() can not be used, since the
first CPU modifies tod_clock_base and get_tod_clock_monotonic() might
return a non-monotonic time. Instead, busy-loop on other CPUs, while
the the first CPU actually handles the synchronization operation
- When debugging the early kernel boot using QEMU with the -S flag and
GDB attached, skip the decompressor and start directly in kernel
- Rename PAI Crypto event 4210 according to z16 and z17 "z/Architecture
Principles of Operation" manual
- Remove the in-kernel time steering support in favour of the new s390
PTP driver, which allows the kernel clock steered more precisely
- Remove a possible false-positive warning in pte_free_defer(), which
could be triggered in a valid case KVM guest process is initializing
* tag 's390-6.17-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (29 commits)
s390/mm: Remove possible false-positive warning in pte_free_defer()
s390/stp: Default to enabled
s390/stp: Remove leap second support
s390/time: Remove in-kernel time steering
s390/sclp: Use monotonic clock in sclp_sync_wait()
s390/smp: Use monotonic clock in smp_emergency_stop()
s390/time: Use monotonic clock in get_cycles()
s390/pai_crypto: Rename PAI Crypto event 4210
scripts/gdb/symbols: make lx-symbols skip the s390 decompressor
s390/boot: Introduce jump_to_kernel() function
s390/stp: Remove udelay from stp_sync_clock()
s390/early: Copy last breaking event address to pt_regs
s390/smp: Remove conditional emergency signal order code usage
s390/uaccess: Merge cmpxchg_user_key() inline assemblies
s390/uaccess: Prevent kprobes on cmpxchg_user_key() functions
s390/uaccess: Initialize code pages executed with non-default access key
s390/skey: Provide infrastructure for executing with non-default access key
s390/uaccess: Make cmpxchg_user_key() library code
s390/page: Add memory clobber to page_set_storage_key()
s390/page: Cleanup page_set_storage_key() inline assemblies
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux
Pull crypto library test updates from Eric Biggers:
"Add KUnit test suites for the Poly1305, SHA-1, SHA-224, SHA-256,
SHA-384, and SHA-512 library functions.
These are the first KUnit tests for lib/crypto/. So in addition to
being useful tests for these specific algorithms, they also establish
some conventions for lib/crypto/ testing going forwards.
The new tests are fairly comprehensive: more comprehensive than the
generic crypto infrastructure's tests. They use a variety of
techniques to check for the types of implementation bugs that tend to
occur in the real world, rather than just naively checking some test
vectors. (Interestingly, poly1305_kunit found a bug in QEMU)
The core test logic is shared by all six algorithms, rather than being
duplicated for each algorithm.
Each algorithm's test suite also optionally includes a benchmark"
* tag 'libcrypto-tests-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux:
lib/crypto: tests: Annotate worker to be on stack
lib/crypto: tests: Add KUnit tests for SHA-1 and HMAC-SHA1
lib/crypto: tests: Add KUnit tests for Poly1305
lib/crypto: tests: Add KUnit tests for SHA-384 and SHA-512
lib/crypto: tests: Add KUnit tests for SHA-224 and SHA-256
lib/crypto: tests: Add hash-test-template.h and gen-hash-testvecs.py
|
|
The following warning traceback is seen if object debugging is enabled
with the new crypto test code.
ODEBUG: object 9000000106237c50 is on stack 9000000106234000, but NOT annotated.
------------[ cut here ]------------
WARNING: lib/debugobjects.c:655 at lookup_object_or_alloc.part.0+0x19c/0x1f4, CPU#0: kunit_try_catch/468
...
This also results in a boot stall when running the code in qemu:loongarch.
Initializing the worker with INIT_WORK_ONSTACK() fixes the problem.
Fixes: 950a81224e8b ("lib/crypto: tests: Add hash-test-template.h and gen-hash-testvecs.py")
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250721231917.3182029-1-linux@roeck-us.net
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|
|
Now that the oldest supported binutils version is 2.30, the macros that
emit the SHA-512 instructions as '.inst' words are no longer needed. So
drop them. No change in the generated machine code.
Changed from the original patch by Ard Biesheuvel:
(https://lore.kernel.org/r/20250515142702.2592942-2-ardb+git@google.com):
- Reduced scope to just SHA-512
- Added comment that explains why "sha3" is used instead of "sha2"
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250718220706.475240-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|
|
The assembly code that does all 80 rounds of SHA-1 is highly repetitive.
Replace it with 20 expansions of a macro that does 4 rounds, using the
macro arguments and .if directives to handle the slight variations
between rounds. This reduces the length of sha1-ni-asm.S by 129 lines
while still producing the exact same object file. This mirrors
sha256-ni-asm.S which uses this same strategy.
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250718191900.42877-3-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|
|
- Store the previous state in %xmm8-%xmm9 instead of spilling it to the
stack. There are plenty of unused XMM registers here, so there is no
reason to spill to the stack. (While 32-bit code is limited to
%xmm0-%xmm7, this is 64-bit code, so it's free to use %xmm8-%xmm15.)
- Remove the unnecessary check for nblocks == 0. sha1_ni_transform() is
always passed a positive nblocks.
- To get an XMM register with 'e' in the high dword and the rest zeroes,
just zeroize the register using pxor, then load 'e'. Previously the
code loaded 'e', then zeroized the lower dwords by AND-ing with a
constant, which was slightly less efficient.
- Instead of computing &DATA_PTR[NBLOCKS << 6] and stopping when
DATA_PTR reaches that value, instead just decrement NBLOCKS on each
iteration and stop when it reaches 0. This is fewer instructions.
- Rename DIGEST_PTR to STATE_PTR. It points to the SHA-1 internal
state, not a SHA-1 digest value.
This commit shrinks the code size of sha1_ni_transform() from 624 bytes
to 589 bytes and also shrinks rodata by 16 bytes.
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250718191900.42877-2-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|
|
Add a KUnit test suite for the SHA-1 library functions, including the
corresponding HMAC support. The core test logic is in the
previously-added hash-test-template.h. This commit just adds the actual
KUnit suite, and it adds the generated test vectors to the tree so that
gen-hash-testvecs.py won't have to be run at build time.
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250712232329.818226-16-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|
|
Add a KUnit test suite for the Poly1305 functions. Most of its test
cases are instantiated from hash-test-template.h, which is also used by
the SHA-2 tests. A couple additional test cases are also included to
test edge cases specific to Poly1305.
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250709200112.258500-5-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|
|
Add KUnit test suites for the SHA-384 and SHA-512 library functions,
including the corresponding HMAC support. The core test logic is in the
previously-added hash-test-template.h. This commit just adds the actual
KUnit suites, and it adds the generated test vectors to the tree so that
gen-hash-testvecs.py won't have to be run at build time.
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250709200112.258500-4-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|
|
Add KUnit test suites for the SHA-224 and SHA-256 library functions,
including the corresponding HMAC support. The core test logic is in the
previously-added hash-test-template.h. This commit just adds the actual
KUnit suites, and it adds the generated test vectors to the tree so that
gen-hash-testvecs.py won't have to be run at build time.
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250709200112.258500-3-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|
|
Add hash-test-template.h which generates the following KUnit test cases
for hash functions:
test_hash_test_vectors
test_hash_all_lens_up_to_4096
test_hash_incremental_updates
test_hash_buffer_overruns
test_hash_overlaps
test_hash_alignment_consistency
test_hash_ctx_zeroization
test_hash_interrupt_context_1
test_hash_interrupt_context_2
test_hmac (when HMAC is supported)
benchmark_hash (when CONFIG_CRYPTO_LIB_BENCHMARK=y)
The initial use cases for this will be sha224_kunit, sha256_kunit,
sha384_kunit, sha512_kunit, and poly1305_kunit.
Add a Python script gen-hash-testvecs.py which generates the test
vectors required by test_hash_test_vectors,
test_hash_all_lens_up_to_4096, and test_hmac.
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250709200112.258500-2-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|
|
Instead of exposing the x86-optimized SHA-1 code via x86-specific
crypto_shash algorithms, instead just implement the sha1_blocks()
library function. This is much simpler, it makes the SHA-1 library
functions be x86-optimized, and it fixes the longstanding issue where
the x86-optimized SHA-1 code was disabled by default. SHA-1 still
remains available through crypto_shash, but individual architectures no
longer need to handle it.
To match sha1_blocks(), change the type of the nblocks parameter of the
assembly functions from int to size_t. The assembly functions actually
already treated it as size_t.
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250712232329.818226-14-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|
|
Instead of exposing the sparc-optimized SHA-1 code via sparc-specific
crypto_shash algorithms, instead just implement the sha1_blocks()
library function. This is much simpler, it makes the SHA-1 library
functions be sparc-optimized, and it fixes the longstanding issue where
the sparc-optimized SHA-1 code was disabled by default. SHA-1 still
remains available through crypto_shash, but individual architectures no
longer need to handle it.
Note: to see the diff from arch/sparc/crypto/sha1_glue.c to
lib/crypto/sparc/sha1.h, view this commit with 'git show -M10'.
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250712232329.818226-13-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|
|
Instead of exposing the s390-optimized SHA-1 code via s390-specific
crypto_shash algorithms, instead just implement the sha1_blocks()
library function. This is much simpler, it makes the SHA-1 library
functions be s390-optimized, and it fixes the longstanding issue where
the s390-optimized SHA-1 code was disabled by default. SHA-1 still
remains available through crypto_shash, but individual architectures no
longer need to handle it.
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250712232329.818226-12-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|
|
Instead of exposing the powerpc-optimized SHA-1 code via
powerpc-specific crypto_shash algorithms, instead just implement the
sha1_blocks() library function. This is much simpler, it makes the
SHA-1 library functions be powerpc-optimized, and it fixes the
longstanding issue where the powerpc-optimized SHA-1 code was disabled
by default. SHA-1 still remains available through crypto_shash, but
individual architectures no longer need to handle it.
Note: to see the diff from arch/powerpc/crypto/sha1-spe-glue.c to
lib/crypto/powerpc/sha1.h, view this commit with 'git show -M10'.
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250712232329.818226-11-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|
|
Instead of exposing the mips-optimized SHA-1 code via mips-specific
crypto_shash algorithms, instead just implement the sha1_blocks()
library function. This is much simpler, it makes the SHA-1 library
functions be mips-optimized, and it fixes the longstanding issue where
the mips-optimized SHA-1 code was disabled by default. SHA-1 still
remains available through crypto_shash, but individual architectures no
longer need to handle it.
Note: to see the diff from arch/mips/cavium-octeon/crypto/octeon-sha1.c
to lib/crypto/mips/sha1.h, view this commit with 'git show -M10'.
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250712232329.818226-10-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|
|
Instead of exposing the arm64-optimized SHA-1 code via arm64-specific
crypto_shash algorithms, instead just implement the sha1_blocks()
library function. This is much simpler, it makes the SHA-1 library
functions be arm64-optimized, and it fixes the longstanding issue where
the arm64-optimized SHA-1 code was disabled by default. SHA-1 still
remains available through crypto_shash, but individual architectures no
longer need to handle it.
Remove support for SHA-1 finalization from assembly code, since the
library does not yet support architecture-specific overrides of the
finalization. (Support for that has been omitted for now, for
simplicity and because usually it isn't performance-critical.)
To match sha1_blocks(), change the type of the nblocks parameter and the
return value of __sha1_ce_transform() from int to size_t. Update the
assembly code accordingly.
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250712232329.818226-9-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|
|
Instead of exposing the arm-optimized SHA-1 code via arm-specific
crypto_shash algorithms, instead just implement the sha1_blocks()
library function. This is much simpler, it makes the SHA-1 library
functions be arm-optimized, and it fixes the longstanding issue where
the arm-optimized SHA-1 code was disabled by default. SHA-1 still
remains available through crypto_shash, but individual architectures no
longer need to handle it.
To match sha1_blocks(), change the type of the nblocks parameter of the
assembly functions from int to size_t. The assembly functions actually
already treated it as size_t.
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250712232329.818226-8-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|
|
Add HMAC support to the SHA-1 library, again following what was done for
SHA-2. Besides providing the basis for a more streamlined "hmac(sha1)"
shash, this will also be useful for multiple in-kernel users such as
net/sctp/auth.c, net/ipv6/seg6_hmac.c, and
security/keys/trusted-keys/trusted_tpm1.c. Those are currently using
crypto_shash, but using the library functions would be much simpler.
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250712232329.818226-5-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|
|
Add a library interface for SHA-1, following the SHA-2 one. As was the
case with SHA-2, this will be useful for various in-kernel users. The
crypto_shash interface will be reimplemented on top of it as well.
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250712232329.818226-4-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|
|
Rename the existing sha1_init() to sha1_init_raw(), since it conflicts
with the upcoming library function. This will later be removed, but
this keeps the kernel building for the introduction of the library.
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250712232329.818226-3-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|
|
While the HMAC library functions support both incremental and one-shot
computation and both prepared and raw keys, the combination of raw key
+ incremental was missing. It turns out that several potential users of
the HMAC library functions (tpm2-sessions.c, smb2transport.c,
trusted_tpm1.c) want exactly that.
Therefore, add the missing functions hmac_sha*_init_usingrawkey().
Implement them in an optimized way that directly initializes the HMAC
context without a separate key preparation step.
Reimplement the one-shot raw key functions hmac_sha*_usingrawkey() on
top of the new functions, which makes them a bit more efficient.
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250711215844.41715-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|
|
Fix poly1305-armv4.pl to not do '.globl poly1305_blocks_neon' when
poly1305_blocks_neon() is not defined. Then, remove the empty __weak
definition of poly1305_blocks_neon(), which was still needed only
because of that unnecessary globl statement. (It also used to be needed
because the compiler could generate calls to it when
CONFIG_KERNEL_MODE_NEON=n, but that has been fixed.)
Thanks to Arnd Bergmann for reporting that the globl statement in the
asm file was still depending on the weak symbol.
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250711212822.6372-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|
|
Restore the len >= 288 condition on using the AVX implementation, which
was incidentally removed by commit 318c53ae02f2 ("crypto: x86/poly1305 -
Add block-only interface"). This check took into account the overhead
in key power computation, kernel-mode "FPU", and tail handling
associated with the AVX code. Indeed, restoring this check slightly
improves performance for len < 256 as measured using poly1305_kunit on
an "AMD Ryzen AI 9 365" (Zen 5) CPU:
Length Before After
====== ========== ==========
1 30 MB/s 36 MB/s
16 516 MB/s 598 MB/s
64 1700 MB/s 1882 MB/s
127 2265 MB/s 2651 MB/s
128 2457 MB/s 2827 MB/s
200 2702 MB/s 3238 MB/s
256 3841 MB/s 3768 MB/s
511 4580 MB/s 4585 MB/s
512 5430 MB/s 5398 MB/s
1024 7268 MB/s 7305 MB/s
3173 8999 MB/s 8948 MB/s
4096 9942 MB/s 9921 MB/s
16384 10557 MB/s 10545 MB/s
While the optimal threshold for this CPU might be slightly lower than
288 (see the len == 256 case), other CPUs would need to be tested too,
and these sorts of benchmarks can underestimate the true cost of
kernel-mode "FPU". Therefore, for now just restore the 288 threshold.
Fixes: 318c53ae02f2 ("crypto: x86/poly1305 - Add block-only interface")
Cc: stable@vger.kernel.org
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250706231100.176113-6-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|
|
Restore the SIMD usability check and base conversion that were removed
by commit 318c53ae02f2 ("crypto: x86/poly1305 - Add block-only
interface").
This safety check is cheap and is well worth eliminating a footgun.
While the Poly1305 functions should not be called when SIMD registers
are unusable, if they are anyway, they should just do the right thing
instead of corrupting random tasks' registers and/or computing incorrect
MACs. Fixing this is also needed for poly1305_kunit to pass.
Just use irq_fpu_usable() instead of the original crypto_simd_usable(),
since poly1305_kunit won't rely on crypto_simd_disabled_for_test.
Fixes: 318c53ae02f2 ("crypto: x86/poly1305 - Add block-only interface")
Cc: stable@vger.kernel.org
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250706231100.176113-5-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|
|
Restore the SIMD usability check that was removed by commit a59e5468a921
("crypto: arm64/poly1305 - Add block-only interface").
This safety check is cheap and is well worth eliminating a footgun.
While the Poly1305 functions should not be called when SIMD registers
are unusable, if they are anyway, they should just do the right thing
instead of corrupting random tasks' registers and/or computing incorrect
MACs. Fixing this is also needed for poly1305_kunit to pass.
Just use may_use_simd() instead of the original crypto_simd_usable(),
since poly1305_kunit won't rely on crypto_simd_disabled_for_test.
Fixes: a59e5468a921 ("crypto: arm64/poly1305 - Add block-only interface")
Cc: stable@vger.kernel.org
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250706231100.176113-4-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|
|
Restore the SIMD usability check that was removed by commit 773426f4771b
("crypto: arm/poly1305 - Add block-only interface").
This safety check is cheap and is well worth eliminating a footgun.
While the Poly1305 functions should not be called when SIMD registers
are unusable, if they are anyway, they should just do the right thing
instead of corrupting random tasks' registers and/or computing incorrect
MACs. Fixing this is also needed for poly1305_kunit to pass.
Just use may_use_simd() instead of the original crypto_simd_usable(),
since poly1305_kunit won't rely on crypto_simd_disabled_for_test.
Fixes: 773426f4771b ("crypto: arm/poly1305 - Add block-only interface")
Cc: stable@vger.kernel.org
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250706231100.176113-3-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|
|
crypto/hash_info.c just contains a couple of arrays that map HASH_ALGO_*
algorithm IDs to properties of those algorithms. It is compiled only
when CRYPTO_HASH_INFO=y, but currently CRYPTO_HASH_INFO depends on
CRYPTO. Since this can be useful without the old-school crypto API,
move it into lib/crypto/ so that it no longer depends on CRYPTO.
This eliminates the need for FS_VERITY to select CRYPTO after it's been
converted to use lib/crypto/.
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250630172224.46909-2-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|
|
Since sha256_blocks() is called only with nblocks >= 1, remove
unnecessary checks for nblocks == 0 from the x86 SHA-256 assembly code.
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250704023958.73274-3-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|
|
As I did for sha512_blocks(), reorganize x86's sha256_blocks() to be
just a static_call. To achieve that, for each assembly function add a C
function that handles the kernel-mode FPU section and fallback. While
this increases total code size slightly, the amount of code actually
executed on a given system does not increase, and it is slightly more
efficient since it eliminates the extra static_key. It also makes the
assembly functions be called with standard direct calls instead of
static calls, eliminating the need for ANNOTATE_NOENDBR.
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250704023958.73274-2-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|
|
The BLOCK_HASH_UPDATE_BLOCKS macro is difficult to read. For now, let's
just write the update explicitly in the straightforward way, mirroring
sha512_update(). It's possible that we'll bring back a macro for this
later, but it needs to be properly justified and hopefully a bit more
readable.
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250630160645.3198-14-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|
|
Consolidate the CPU-based SHA-256 code into a single module, following
what I did with SHA-512:
- Each arch now provides a header file lib/crypto/$(SRCARCH)/sha256.h,
replacing lib/crypto/$(SRCARCH)/sha256.c. The header defines
sha256_blocks() and optionally sha256_mod_init_arch(). It is included
by lib/crypto/sha256.c, and thus the code gets built into the single
libsha256 module, with proper inlining and dead code elimination.
- sha256_blocks_generic() is moved from lib/crypto/sha256-generic.c into
lib/crypto/sha256.c. It's now a static function marked with
__maybe_unused, so the compiler automatically eliminates it in any
cases where it's not used.
- Whether arch-optimized SHA-256 is buildable is now controlled
centrally by lib/crypto/Kconfig instead of by
lib/crypto/$(SRCARCH)/Kconfig. The conditions for enabling it remain
the same as before, and it remains enabled by default.
- Any additional arch-specific translation units for the optimized
SHA-256 code (such as assembly files) are now compiled by
lib/crypto/Makefile instead of lib/crypto/$(SRCARCH)/Makefile.
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250630160645.3198-13-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|