| Age | Commit message (Collapse) | Author |
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking updates from Ingo Molnar:
"Lock debugging:
- Implement compiler-driven static analysis locking context checking,
using the upcoming Clang 22 compiler's context analysis features
(Marco Elver)
We removed Sparse context analysis support, because prior to
removal even a defconfig kernel produced 1,700+ context tracking
Sparse warnings, the overwhelming majority of which are false
positives. On an allmodconfig kernel the number of false positive
context tracking Sparse warnings grows to over 5,200... On the plus
side of the balance actual locking bugs found by Sparse context
analysis is also rather ... sparse: I found only 3 such commits in
the last 3 years. So the rate of false positives and the
maintenance overhead is rather high and there appears to be no
active policy in place to achieve a zero-warnings baseline to move
the annotations & fixers to developers who introduce new code.
Clang context analysis is more complete and more aggressive in
trying to find bugs, at least in principle. Plus it has a different
model to enabling it: it's enabled subsystem by subsystem, which
results in zero warnings on all relevant kernel builds (as far as
our testing managed to cover it). Which allowed us to enable it by
default, similar to other compiler warnings, with the expectation
that there are no warnings going forward. This enforces a
zero-warnings baseline on clang-22+ builds (Which are still limited
in distribution, admittedly)
Hopefully the Clang approach can lead to a more maintainable
zero-warnings status quo and policy, with more and more subsystems
and drivers enabling the feature. Context tracking can be enabled
for all kernel code via WARN_CONTEXT_ANALYSIS_ALL=y (default
disabled), but this will generate a lot of false positives.
( Having said that, Sparse support could still be added back,
if anyone is interested - the removal patch is still
relatively straightforward to revert at this stage. )
Rust integration updates: (Alice Ryhl, Fujita Tomonori, Boqun Feng)
- Add support for Atomic<i8/i16/bool> and replace most Rust native
AtomicBool usages with Atomic<bool>
- Clean up LockClassKey and improve its documentation
- Add missing Send and Sync trait implementation for SetOnce
- Make ARef Unpin as it is supposed to be
- Add __rust_helper to a few Rust helpers as a preparation for
helper LTO
- Inline various lock related functions to avoid additional function
calls
WW mutexes:
- Extend ww_mutex tests and other test-ww_mutex updates (John
Stultz)
Misc fixes and cleanups:
- rcu: Mark lockdep_assert_rcu_helper() __always_inline (Arnd
Bergmann)
- locking/local_lock: Include more missing headers (Peter Zijlstra)
- seqlock: fix scoped_seqlock_read kernel-doc (Randy Dunlap)
- rust: sync: Replace `kernel::c_str!` with C-Strings (Tamir
Duberstein)"
* tag 'locking-core-2026-02-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (90 commits)
locking/rwlock: Fix write_trylock_irqsave() with CONFIG_INLINE_WRITE_TRYLOCK
rcu: Mark lockdep_assert_rcu_helper() __always_inline
compiler-context-analysis: Remove __assume_ctx_lock from initializers
tomoyo: Use scoped init guard
crypto: Use scoped init guard
kcov: Use scoped init guard
compiler-context-analysis: Introduce scoped init guards
cleanup: Make __DEFINE_LOCK_GUARD handle commas in initializers
seqlock: fix scoped_seqlock_read kernel-doc
tools: Update context analysis macros in compiler_types.h
rust: sync: Replace `kernel::c_str!` with C-Strings
rust: sync: Inline various lock related methods
rust: helpers: Move #define __rust_helper out of atomic.c
rust: wait: Add __rust_helper to helpers
rust: time: Add __rust_helper to helpers
rust: task: Add __rust_helper to helpers
rust: sync: Add __rust_helper to helpers
rust: refcount: Add __rust_helper to helpers
rust: rcu: Add __rust_helper to helpers
rust: processor: Add __rust_helper to helpers
...
|
|
Pull bitmap updates from Yury Norov:
- more rust helpers (Alice)
- more bitops tests (Ryota)
- FIND_NTH_BIT() uninitialized variable fix (Lee Yongjun)
- random cleanups (Andy, H. Peter)
* tag 'bitmap-for-6.20' of https://github.com/norov/linux:
lib/tests: extend KUnit test for bitops with more cases
bitops: Add more files to the MAINTAINERS
lib/find_bit: fix uninitialized variable use in FIND_NTH_BIT
lib/tests: add KUnit test for bitops
rust: cpumask: add __rust_helper to helpers
rust: bitops: add __rust_helper to helpers
rust: bitmap: add __rust_helper to helpers
linux/bitfield.h: replace __auto_type with auto
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Pull bpf updates from Alexei Starovoitov:
- Support associating BPF program with struct_ops (Amery Hung)
- Switch BPF local storage to rqspinlock and remove recursion detection
counters which were causing false positives (Amery Hung)
- Fix live registers marking for indirect jumps (Anton Protopopov)
- Introduce execution context detection BPF helpers (Changwoo Min)
- Improve verifier precision for 32bit sign extension pattern
(Cupertino Miranda)
- Optimize BTF type lookup by sorting vmlinux BTF and doing binary
search (Donglin Peng)
- Allow states pruning for misc/invalid slots in iterator loops (Eduard
Zingerman)
- In preparation for ASAN support in BPF arenas teach libbpf to move
global BPF variables to the end of the region and enable arena kfuncs
while holding locks (Emil Tsalapatis)
- Introduce support for implicit arguments in kfuncs and migrate a
number of them to new API. This is a prerequisite for cgroup
sub-schedulers in sched-ext (Ihor Solodrai)
- Fix incorrect copied_seq calculation in sockmap (Jiayuan Chen)
- Fix ORC stack unwind from kprobe_multi (Jiri Olsa)
- Speed up fentry attach by using single ftrace direct ops in BPF
trampolines (Jiri Olsa)
- Require frozen map for calculating map hash (KP Singh)
- Fix lock entry creation in TAS fallback in rqspinlock (Kumar
Kartikeya Dwivedi)
- Allow user space to select cpu in lookup/update operations on per-cpu
array and hash maps (Leon Hwang)
- Make kfuncs return trusted pointers by default (Matt Bobrowski)
- Introduce "fsession" support where single BPF program is executed
upon entry and exit from traced kernel function (Menglong Dong)
- Allow bpf_timer and bpf_wq use in all programs types (Mykyta
Yatsenko, Andrii Nakryiko, Kumar Kartikeya Dwivedi, Alexei
Starovoitov)
- Make KF_TRUSTED_ARGS the default for all kfuncs and clean up their
definition across the tree (Puranjay Mohan)
- Allow BPF arena calls from non-sleepable context (Puranjay Mohan)
- Improve register id comparison logic in the verifier and extend
linked registers with negative offsets (Puranjay Mohan)
- In preparation for BPF-OOM introduce kfuncs to access memcg events
(Roman Gushchin)
- Use CFI compatible destructor kfunc type (Sami Tolvanen)
- Add bitwise tracking for BPF_END in the verifier (Tianci Cao)
- Add range tracking for BPF_DIV and BPF_MOD in the verifier (Yazhou
Tang)
- Make BPF selftests work with 64k page size (Yonghong Song)
* tag 'bpf-next-7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (268 commits)
selftests/bpf: Fix outdated test on storage->smap
selftests/bpf: Choose another percpu variable in bpf for btf_dump test
selftests/bpf: Remove test_task_storage_map_stress_lookup
selftests/bpf: Update task_local_storage/task_storage_nodeadlock test
selftests/bpf: Update task_local_storage/recursion test
selftests/bpf: Update sk_storage_omem_uncharge test
bpf: Switch to bpf_selem_unlink_nofail in bpf_local_storage_{map_free, destroy}
bpf: Support lockless unlink when freeing map or local storage
bpf: Prepare for bpf_selem_unlink_nofail()
bpf: Remove unused percpu counter from bpf_local_storage_map_free
bpf: Remove cgroup local storage percpu counter
bpf: Remove task local storage percpu counter
bpf: Change local_storage->lock and b->lock to rqspinlock
bpf: Convert bpf_selem_unlink to failable
bpf: Convert bpf_selem_link_map to failable
bpf: Convert bpf_selem_unlink_map to failable
bpf: Select bpf_local_storage_map_bucket based on bpf_local_storage
selftests/xsk: fix number of Tx frags in invalid packet
selftests/xsk: properly handle batch ending in the middle of a packet
bpf: Prevent reentrance into call_rcu_tasks_trace()
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull hardening updates from Kees Cook:
"Mostly small cleanups and various scattered annotations and flex array
warning fixes that we reviewed by unlanded in other trees. Introduces
new annotation for expanding counted_by to pointer members, now that
compiler behavior between GCC and Clang has been normalized.
- Various missed __counted_by annotations (Thorsten Blum)
- Various missed -Wflex-array-member-not-at-end fixes (Gustavo A. R.
Silva)
- Avoid leftover tempfiles for interrupted compile-time FORTIFY tests
(Nicolas Schier)
- Remove non-existant CONFIG_UBSAN_REPORT_FULL from docs (Stefan
Wiehler)
- fortify: Use C arithmetic not FIELD_xxx() in FORTIFY_REASON defines
(David Laight)
- Add __counted_by_ptr attribute, tests, and first user (Bill
Wendling, Kees Cook)
- Update MAINTAINERS file to make hardening section not include
pstore"
* tag 'hardening-v7.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
MAINTAINERS: pstore: Remove L: entry
nfp: tls: Avoid -Wflex-array-member-not-at-end warnings
carl9170: Avoid -Wflex-array-member-not-at-end warning
coredump: Use __counted_by_ptr for struct core_name::corename
lkdtm/bugs: Add __counted_by_ptr() test PTR_BOUNDS
compiler_types.h: Attributes: Add __counted_by_ptr macro
fortify: Cleanup temp file also on non-successful exit
fortify: Rename temporary file to match ignore pattern
fortify: Use C arithmetic not FIELD_xxx() in FORTIFY_REASON defines
ecryptfs: Annotate struct ecryptfs_message with __counted_by
fs/xattr: Annotate struct simple_xattr with __counted_by
crypto: af_alg - Annotate struct af_alg_iv with __counted_by
Kconfig.ubsan: Remove CONFIG_UBSAN_REPORT_FULL from documentation
drm/nouveau: fifo: Avoid -Wflex-array-member-not-at-end warning
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux
Pull crypto library updates from Eric Biggers:
- Add support for verifying ML-DSA signatures.
ML-DSA (Module-Lattice-Based Digital Signature Algorithm) is a
recently-standardized post-quantum (quantum-resistant) signature
algorithm. It was known as Dilithium pre-standardization.
The first use case in the kernel will be module signing. But there
are also other users of RSA and ECDSA signatures in the kernel that
might want to upgrade to ML-DSA eventually.
- Improve the AES library:
- Make the AES key expansion and single block encryption and
decryption functions use the architecture-optimized AES code.
Enable these optimizations by default.
- Support preparing an AES key for encryption-only, using about
half as much memory as a bidirectional key.
- Replace the existing two generic implementations of AES with a
single one.
- Simplify how Adiantum message hashing is implemented. Remove the
"nhpoly1305" crypto_shash in favor of direct lib/crypto/ support for
NH hashing, and enable optimizations by default.
* tag 'libcrypto-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux: (53 commits)
lib/crypto: mldsa: Clarify the documentation for mldsa_verify() slightly
lib/crypto: aes: Drop 'volatile' from aes_sbox and aes_inv_sbox
lib/crypto: aes: Remove old AES en/decryption functions
lib/crypto: aesgcm: Use new AES library API
lib/crypto: aescfb: Use new AES library API
crypto: omap - Use new AES library API
crypto: inside-secure - Use new AES library API
crypto: drbg - Use new AES library API
crypto: crypto4xx - Use new AES library API
crypto: chelsio - Use new AES library API
crypto: ccp - Use new AES library API
crypto: x86/aes-gcm - Use new AES library API
crypto: arm64/ghash - Use new AES library API
crypto: arm/ghash - Use new AES library API
staging: rtl8723bs: core: Use new AES library API
net: phy: mscc: macsec: Use new AES library API
chelsio: Use new AES library API
Bluetooth: SMP: Use new AES library API
crypto: x86/aes - Remove the superseded AES-NI crypto_cipher
lib/crypto: x86/aes: Add AES-NI optimization
...
|
|
debugobjects uses __GFP_HIGH for allocations as it might be invoked
within locked regions. That worked perfectly fine until v6.18. It still
works correctly when deferred page initialization is disabled and works
by chance when no page allocation is required before deferred page
initialization has completed.
Since v6.18 allocations w/o a reclaim flag cause new_slab() to end up in
alloc_frozen_pages_nolock_noprof(), which returns early when deferred
page initialization has not yet completed. As the deferred page
initialization takes quite a while the debugobject pool is depleted and
debugobjects are disabled.
This can be worked around when PREEMPT_COUNT is enabled as that allows
debugobjects to add __GFP_KSWAPD_RECLAIM to the GFP flags when the context
is preemtible. When PREEMPT_COUNT is disabled the context is unknown and
the reclaim bit can't be set because the caller might hold locks which
might deadlock in the allocator.
In preemptible context the reclaim bit is harmless and not a performance
issue as that's usually invoked from slow path initialization context.
That makes debugobjects depend on PREEMPT_COUNT || !DEFERRED_STRUCT_PAGE_INIT.
Fixes: af92793e52c3 ("slab: Introduce kmalloc_nolock() and kfree_nolock().")
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Tested-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Link: https://patch.msgid.link/87pl6gznti.ffs@tglx
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux
Pull bounce buffer dio for stable pages from Jens Axboe:
"This adds support for bounce buffering of dio for stable pages. This
was all done by Christoph. In his words:
This series tries to address the problem that under I/O pages can be
modified during direct I/O, even when the device or file system
require stable pages during I/O to calculate checksums, parity or data
operations. It does so by adding block layer helpers to bounce buffer
an iov_iter into a bio, then wires that up in iomap and ultimately
XFS.
The reason that the file system even needs to know about it, is
because reads need a user context to copy the data back, and the
infrastructure to defer ioends to a workqueue currently sits in XFS.
I'm going to look into moving that into ioend and enabling it for
other file systems. Additionally btrfs already has it's own
infrastructure for this, and actually an urgent need to bounce buffer,
so this should be useful there and could be wire up easily. In fact
the idea comes from patches by Qu that did this in btrfs.
This patch fixes all but one xfstests failures on T10 PI capable
devices (generic/095 seems to have issues with a mix of mmap and
splice still, I'm looking into that separately), and make qemu VMs
running Windows, or Linux with swap enabled fine on an XFS file on a
device using PI.
Performance numbers on my (not exactly state of the art) NVMe PI test
setup:
Sequential reads using io_uring, QD=16.
Bandwidth and CPU usage (usr/sys):
| size | zero copy | bounce |
+------+--------------------------+--------------------------+
| 4k | 1316MiB/s (12.65/55.40%) | 1081MiB/s (11.76/49.78%) |
| 64K | 3370MiB/s ( 5.46/18.20%) | 3365MiB/s ( 4.47/15.68%) |
| 1M | 3401MiB/s ( 0.76/23.05%) | 3400MiB/s ( 0.80/09.06%) |
+------+--------------------------+--------------------------+
Sequential writes using io_uring, QD=16.
Bandwidth and CPU usage (usr/sys):
| size | zero copy | bounce |
+------+--------------------------+--------------------------+
| 4k | 882MiB/s (11.83/33.88%) | 750MiB/s (10.53/34.08%) |
| 64K | 2009MiB/s ( 7.33/15.80%) | 2007MiB/s ( 7.47/24.71%) |
| 1M | 1992MiB/s ( 7.26/ 9.13%) | 1992MiB/s ( 9.21/19.11%) |
+------+--------------------------+--------------------------+
Note that the 64k read numbers look really odd to me for the baseline
zero copy case, but are reproducible over many repeated runs.
The bounce read numbers should further improve when moving the PI
validation to the file system and removing the double context switch,
which I have patches for that will sent out soon"
* tag 'for-7.0/block-stable-pages-20260206' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux:
xfs: use bounce buffering direct I/O when the device requires stable pages
iomap: add a flag to bounce buffer direct I/O
iomap: support ioends for direct reads
iomap: rename IOMAP_DIO_DIRTY to IOMAP_DIO_USER_BACKED
iomap: free the bio before completing the dio
iomap: share code between iomap_dio_bio_end_io and iomap_finish_ioend_direct
iomap: split out the per-bio logic from iomap_dio_bio_iter
iomap: simplify iomap_dio_bio_iter
iomap: fix submission side handling of completion side errors
block: add helpers to bounce buffer an iov_iter into bios
block: remove bio_release_page
iov_iter: extract a iov_iter_extract_bvecs helper from bio code
block: open code bio_add_page and fix handling of mismatching P2P ranges
block: refactor get_contig_folio_len
block: add a BIO_MAX_SIZE constant and use it
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
Pull kunit updates from Shuah Khan:
"kunit:
- add __rust_helper to helpers
- fix up const mismatch in many assert functions
- fix up const mismatch in test_list_sort
- protect KUNIT_BINARY_STR_ASSERTION against ERR_PTR values
- respect KBUILD_OUTPUT env variable by default
- add bash completion
kunit tool:
- add test for nested test result reporting
- do not overwrite test status based on subtest counts
- add 32-bit big endian ARM configuration to qemu_configs
- rename test_data_path() to _test_data_path()
- do not rely on implicit working directory change"
* tag 'linux_kselftest-kunit-6.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
kunit: add bash completion
kunit: tool: test: Don't rely on implicit working directory change
kunit: tool: test: Rename test_data_path() to _test_data_path()
kunit: qemu_configs: Add 32-bit big endian ARM configuration
kunit: tool: Don't overwrite test status based on subtest counts
kunit: tool: Add test for nested test result reporting
kunit: respect KBUILD_OUTPUT env variable by default
kunit: Protect KUNIT_BINARY_STR_ASSERTION against ERR_PTR values
test_list_sort: fix up const mismatch
kunit: fix up const mis-match in many assert functions
rust: kunit: add __rust_helper to helpers
|
|
Extend a KUnit test suite for the bitops API to cover more APIs from
include/asm-generic/bitops/instrumented-atomic.h.
- change_bit()
- test_and_set_bit()
- test_and_clear_bit()
- test_and_change_bit()
Verified on x86_64, i386, and arm64 architectures.
Sample KUnit output:
KTAP version 1
# Subtest: test_change_bit
ok 1 BITOPS_4
ok 2 BITOPS_7
ok 3 BITOPS_11
ok 4 BITOPS_31
ok 5 BITOPS_88
# test_change_bit: pass:5 fail:0 skip:0 total:5
ok 2 test_change_bit
KTAP version 1
# Subtest: test_test_and_set_bit_test_and_clear_bit
ok 1 BITOPS_4
ok 2 BITOPS_7
ok 3 BITOPS_11
ok 4 BITOPS_31
ok 5 BITOPS_88
# test_test_and_set_bit_test_and_clear_bit: pass:5 fail:0 skip:0 total:5
ok 3 test_test_and_set_bit_test_and_clear_bit
KTAP version 1
# Subtest: test_test_and_change_bit
ok 1 BITOPS_4
ok 2 BITOPS_7
ok 3 BITOPS_11
ok 4 BITOPS_31
ok 5 BITOPS_88
# test_test_and_change_bit: pass:5 fail:0 skip:0 total:5
ok 4 test_test_and_change_bit
Signed-off-by: Ryota Sakamoto <sakamo.ryota@gmail.com>
Signed-off-by: Yury Norov <ynorov@nvidia.com>
|
|
In the FIND_NTH_BIT macro, if the 'size' parameter is 0, both the
loop conditions and the modulo condition are not met. Consequently,
the 'tmp' variable remains uninitialized before being used in the
'found' label.
This results in the following smatch errors:
lib/find_bit.c:164 __find_nth_bit() error: uninitialized symbol 'tmp'.
lib/find_bit.c:171 __find_nth_and_bit() error: uninitialized symbol 'tmp'.
lib/find_bit.c:178 __find_nth_andnot_bit() error: uninitialized symbol 'tmp'.
lib/find_bit.c:187 __find_nth_and_andnot_bit() error: uninitialized symbol 'tmp'.
Initialize 'tmp' to 0 to ensure that fns() operates on a zeroed value
(no bits set) when size is 0, preventing the use of garbage values.
[Yury: size == 0 is generally a sign of error on client side, and in
this case, any returned value is OK because the returned value would be
greater than 'size'. Applying the patch to reduce the checker noise.]
Signed-off-by: Lee Yongjun <jun85566@gmail.com>
Signed-off-by: Yury Norov <ynorov@nvidia.com>
|
|
Add a KUnit test suite for the bitops API.
The existing 'lib/test_bitops.c' is preserved as-is because it contains
ad-hoc micro-benchmarks 'test_fns' and is intended to ensure no compiler
warnings from C=1 sparse checker or -Wextra compilations.
Introduce 'lib/tests/bitops_kunit.c' for functional regression testing. It
ports the test logic and data patterns from 'lib/test_bitops.c' to KUnit,
verifying correct behavior across various input patterns and
architecture-specific edge cases using isolated stack-allocated bitmaps.
The following test logic has been ported from test_bitops_startup() in
lib/test_bitops.c:
- set_bit() / clear_bit() / find_first_bit() validation ->
test_set_bit_clear_bit()
- get_count_order() validation -> test_get_count_order()
- get_count_order_long() validation -> test_get_count_order_long()
Also improve the find_first_bit() test to check the full bitmap length
(BITOPS_LENGTH) instead of omitting the last bit, ensuring the bitmap is
completely empty after cleanup.
Verified on x86_64, i386, and arm64 architectures.
Sample KUnit output:
KTAP version 1
# Subtest: bitops
# module: bitops_kunit
1..3
KTAP version 1
# Subtest: test_set_bit_clear_bit
ok 1 BITOPS_4
ok 2 BITOPS_7
ok 3 BITOPS_11
ok 4 BITOPS_31
ok 5 BITOPS_88
# test_set_bit_clear_bit: pass:5 fail:0 skip:0 total:5
ok 1 test_set_bit_clear_bit
KTAP version 1
# Subtest: test_get_count_order
ok 1 0x00000003
ok 2 0x00000004
ok 3 0x00001fff
ok 4 0x00002000
ok 5 0x50000000
ok 6 0x80000000
ok 7 0x80003000
# test_get_count_order: pass:7 fail:0 skip:0 total:7
ok 2 test_get_count_order
KTAP version 1
# Subtest: test_get_count_order_long
ok 1 0x0000000300000000
ok 2 0x0000000400000000
ok 3 0x00001fff00000000
ok 4 0x0000200000000000
ok 5 0x5000000000000000
ok 6 0x8000000000000000
ok 7 0x8000300000000000
# test_get_count_order_long: pass:7 fail:0 skip:0 total:7
ok 3 test_get_count_order_long
[Yury: trim Kconfig help message]
CC: Jesse Brandeburg <jesse.brandeburg@intel.com>
CC: Wei Yang <richard.weiyang@gmail.com>
Signed-off-by: Ryota Sakamoto <sakamo.ryota@gmail.com>
Signed-off-by: Yury Norov <ynorov@nvidia.com>
|
|
Introduce an in-kernel test module to validate the core logic of the Live
Update Orchestrator's File-Lifecycle-Bound feature. This provides a
low-level, controlled environment to test FLB registration and callback
invocation without requiring userspace interaction or actual kexec
reboots.
The test is enabled by the CONFIG_LIVEUPDATE_TEST Kconfig option.
Link: https://lkml.kernel.org/r/20251218155752.3045808-6-pasha.tatashin@soleen.com
Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Cc: Alexander Graf <graf@amazon.com>
Cc: David Gow <davidgow@google.com>
Cc: David Matlack <dmatlack@google.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Kees Cook <kees@kernel.org>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Pratyush Yadav <pratyush@kernel.org>
Cc: Samiullah Khawaja <skhawaja@google.com>
Cc: Tamir Duberstein <tamird@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Add a KUnit test suite for the new private list primitives.
The test defines a struct with a __private list_head and exercises every
macro defined in <linux/list_private.h>.
This ensures that the macros correctly handle the ACCESS_PRIVATE()
abstraction and compile without warnings when acting on private members,
verifying that qualifiers are stripped and offsets are calculated
correctly.
Link: https://lkml.kernel.org/r/20251218155752.3045808-3-pasha.tatashin@soleen.com
Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Reviewed-by: David Gow <davidgow@google.com>
Cc: Alexander Graf <graf@amazon.com>
Cc: David Matlack <dmatlack@google.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Kees Cook <kees@kernel.org>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Pratyush Yadav <pratyush@kernel.org>
Cc: Samiullah Khawaja <skhawaja@google.com>
Cc: Tamir Duberstein <tamird@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Fix PROCMAP_QUERY to fetch optional build ID only after dropping mmap_lock
or per-VMA lock, whichever was used to lock VMA under question, to avoid
deadlock reported by syzbot:
-> #1 (&mm->mmap_lock){++++}-{4:4}:
__might_fault+0xed/0x170
_copy_to_iter+0x118/0x1720
copy_page_to_iter+0x12d/0x1e0
filemap_read+0x720/0x10a0
blkdev_read_iter+0x2b5/0x4e0
vfs_read+0x7f4/0xae0
ksys_read+0x12a/0x250
do_syscall_64+0xcb/0xf80
entry_SYSCALL_64_after_hwframe+0x77/0x7f
-> #0 (&sb->s_type->i_mutex_key#8){++++}-{4:4}:
__lock_acquire+0x1509/0x26d0
lock_acquire+0x185/0x340
down_read+0x98/0x490
blkdev_read_iter+0x2a7/0x4e0
__kernel_read+0x39a/0xa90
freader_fetch+0x1d5/0xa80
__build_id_parse.isra.0+0xea/0x6a0
do_procmap_query+0xd75/0x1050
procfs_procmap_ioctl+0x7a/0xb0
__x64_sys_ioctl+0x18e/0x210
do_syscall_64+0xcb/0xf80
entry_SYSCALL_64_after_hwframe+0x77/0x7f
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
rlock(&mm->mmap_lock);
lock(&sb->s_type->i_mutex_key#8);
lock(&mm->mmap_lock);
rlock(&sb->s_type->i_mutex_key#8);
*** DEADLOCK ***
This seems to be exacerbated (as we haven't seen these syzbot reports
before that) by the recent:
777a8560fd29 ("lib/buildid: use __kernel_read() for sleepable context")
To make this safe, we need to grab file refcount while VMA is still locked, but
other than that everything is pretty straightforward. Internal build_id_parse()
API assumes VMA is passed, but it only needs the underlying file reference, so
just add another variant build_id_parse_file() that expects file passed
directly.
[akpm@linux-foundation.org: fix up kerneldoc]
Link: https://lkml.kernel.org/r/20260129215340.3742283-1-andrii@kernel.org
Fixes: ed5d583a88a9 ("fs/procfs: implement efficient VMA querying API for /proc/<pid>/maps")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Reported-by: <syzbot+4e70c8e0a2017b432f7a@syzkaller.appspotmail.com>
Reviewed-by: Suren Baghdasaryan <surenb@google.com>
Tested-by: Suren Baghdasaryan <surenb@google.com>
Reviewed-by: Shakeel Butt <shakeel.butt@linux.dev>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Eduard Zingerman <eddyz87@gmail.com>
Cc: Hao Luo <haoluo@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: KP Singh <kpsingh@kernel.org>
Cc: Martin KaFai Lau <martin.lau@linux.dev>
Cc: Song Liu <song@kernel.org>
Cc: Stanislav Fomichev <sdf@fomichev.me>
Cc: Yonghong Song <yonghong.song@linux.dev>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Cross-merge networking fixes after downstream PR (net-6.19-rc9).
No adjacent changes, conflicts:
drivers/net/ethernet/spacemit/k1_emac.c
3125fc1701694 ("net: spacemit: k1-emac: fix jumbo frame support")
f66086798f91f ("net: spacemit: Remove broken flow control support")
https://lore.kernel.org/aYIysFIE9ooavWia@sirena.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Terminate the value search for a key if it hits a newline and make
the value empty.
When we pass a bootconfig with an empty value terminated by the
newline, like below::
foo =
bar = value
Current bootconfig interprets it as a single entry::
foo = "bar = value";
The Documentation/admin-guide/bootconfig.rst defines the value
itself is terminated by newline:
The value has to be terminated by semi-colon (``;``) or newline (``\n``).
but it does not define when the value search is terminated.
This changes the behavior to be more line-oriented, so that it is
clearer in how it works.
- The value search of key-value pair will be terminated by a comment
or newline.
- The value search of an array will continue beyond comments and
newlines.
Thus, with this update, the above example is interpreted as::
foo = "";
bar = "value";
And the below example will cause a syntax error because "bar" is expected
as a key but it has ','.
foo =
bar, buz
According to this change, one wrong example config is updated.
Link: https://lore.kernel.org/all/177025238503.14982.17059549076175612447.stgit@devnote2/
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Reviewed-by: Julius Werner <jwerner@chromium.org>
|
|
mldsa_verify() implements ML-DSA.Verify with ctx='', so document this
more explicitly. Remove the one-liner comment above mldsa_verify()
which was somewhat misleading.
Reviewed-by: David Howells <dhowells@redhat.com>
Link: https://lore.kernel.org/r/20260202221552.174341-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|
|
Add a kernel config option to set the default value of
workqueue.panic_on_stall, similar to CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC,
CONFIG_BOOTPARAM_HARDLOCKUP_PANIC and CONFIG_BOOTPARAM_HUNG_TASK_PANIC.
This allows setting the number of workqueue stalls before triggering
a kernel panic at build time, which is useful for high-availability
systems that need consistent panic-on-stall, in other words, those
servers which run with CONFIG_BOOTPARAM_*_PANIC=y already.
The default remains 0 (disabled). Setting it to 1 will panic on the
first stall, and higher values will panic after that many stall
warnings. The value can still be overridden at runtime via the
workqueue.panic_on_stall boot parameter or sysfs.
Signed-off-by: Breno Leitao <leitao@debian.org>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
required to merge "kho: use unsigned long for nr_pages".
|
|
Boot parameters prefixed with "sysctl." are processed during the final
stage of system initialization via kernel_init()-> do_sysctl_args(). When
CONFIG_MEM_ALLOC_PROFILING_DEBUG is enabled, the sysctl.vm.mem_profiling
entry is not writable and will cause a warning.
Before run_init_process(), system initialization executes in kernel thread
context. Use current->mm to distinguish sysctl writes during
do_sysctl_args() from user-space triggered ones.
And when the proc_handler is from do_sysctl_args(), always return success
because the same value was already set by setup_early_mem_profiling() and
this eliminates a permission denied warning.
Link: https://lkml.kernel.org/r/20260115031536.164254-1-ranxiaokai627@163.com
Signed-off-by: Ran Xiaokai <ran.xiaokai@zte.com.cn>
Suggested-by: Suren Baghdasaryan <surenb@google.com>
Acked-by: Suren Baghdasaryan <surenb@google.com>
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
swap: fix race of truncate and swap entry split", needed for merging "mm,
swap: cleanup swap entry management workflow".
|
|
Remove __assume_ctx_lock() from lock initializers.
Implicitly asserting an active context during initialization caused
false-positive double-lock errors when acquiring a lock immediately after its
initialization. Moving forward, guarded member initialization must either:
1. Use guard(type_init)(&lock) or scoped_guard(type_init, ...).
2. Use context_unsafe() for simple initialization.
Reported-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Marco Elver <elver@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/all/57062131-e79e-42c2-aa0b-8f931cb8cac2@acm.org/
Link: https://patch.msgid.link/20260119094029.1344361-7-elver@google.com
|
|
Add scoped init guard definitions for common synchronization primitives
supported by context analysis.
The scoped init guards treat the context as active within initialization
scope of the underlying context lock, given initialization implies
exclusive access to the underlying object. This allows initialization of
guarded members without disabling context analysis, while documenting
initialization from subsequent usage.
The documentation is updated with the new recommendation. Where scoped
init guards are not provided or cannot be implemented (ww_mutex omitted
for lack of multi-arg guard initializers), the alternative is to just
disable context analysis where guarded members are initialized.
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Marco Elver <elver@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/all/20251212095943.GM3911114@noisy.programming.kicks-ass.net/
Link: https://patch.msgid.link/20260119094029.1344361-3-elver@google.com
|
|
Massage __bio_iov_iter_get_pages so that it doesn't need the bio, and
move it to lib/iov_iter.c so that it can be used by block code for
other things than filling a bio and by other subsystems like netfs.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
This allows the compiler to check the arguments against the __printf()
attribute on __test(). This produces better diagnostics when incorrect
inputs are passed.
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202512061600.89CKQ3ag-lkp@intel.com/
Signed-off-by: Tamir Duberstein <tamird@kernel.org>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://patch.msgid.link/20260121-printf-kunit-printf-attr-v3-1-4144f337ec8b@kernel.org
Signed-off-by: Petr Mladek <pmladek@suse.com>
|
|
Now that there are no users of the low-level SHA-1 interface, remove it.
Specifically:
- Remove SHA1_DIGEST_WORDS (no longer used)
- Remove sha1_init_raw() (no longer used)
- Rename sha1_transform() to sha1_block_generic() and make it static
- Move SHA1_WORKSPACE_WORDS into lib/crypto/sha1.c
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Link: https://patch.msgid.link/20260123051656.396371-3-ebiggers@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
As CPU core counts increase, the number of NVMe IRQs may be smaller than
the total number of CPUs. This forces multiple CPUs to share the same
IRQ. If the IRQ affinity and the CPU's cluster do not align, a
performance penalty can be observed on some platforms.
This patch improves IRQ affinity by grouping CPUs by cluster within each
NUMA domain, ensuring better locality between CPUs and their assigned NVMe
IRQs.
Details:
Intel Xeon E platform packs 4 CPU cores as 1 module (cluster) and share
the L2 cache. Let's say, if there are 40 CPUs in 1 NUMA domain and 11
IRQs to dispatch. The existing algorithm will map first 7 IRQs each with
4 CPUs and remained 4 IRQs each with 3 CPUs. The last 4 IRQs may have
cross cluster issue. For example, the 9th IRQ which pinned to CPU32, then
for CPU31, it will have cross L2 memory access.
CPU |28 29 30 31|32 33 34 35|36 ...
-------- -------- --------
IRQ 8 9 10
If this patch applied, then first 2 IRQs each mapped with 2 CPUs and rest
9 IRQs each mapped with 4 CPUs, which avoids the cross cluster memory
access.
CPU |00 01 02 03|04 05 06 07|08 09 10 11| ...
----- ----- ----------- -----------
IRQ 1 2 3 4
As a result, 15%+ performance difference is observed in FIO
libaio/randread/bs=8k.
Changes since V1:
- Add more performance details in commit messages.
- Fix endless loop when topology_cluster_cpumask return invalid mask.
History:
v1: https://lore.kernel.org/all/20251024023038.872616-1-wangyang.guo@intel.com/
v1 [RESEND]: https://lore.kernel.org/all/20251111020608.1501543-1-wangyang.guo@intel.com/
Link: https://lkml.kernel.org/r/20260113022958.3379650-1-wangyang.guo@intel.com
Signed-off-by: Wangyang Guo <wangyang.guo@intel.com>
Reviewed-by: Tianyou Li <tianyou.li@intel.com>
Reviewed-by: Tim Chen <tim.c.chen@linux.intel.com>
Tested-by: Dan Liang <dan.liang@intel.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Jens Axboe <axboe@fb.com>
Cc: Keith Busch <kbusch@kernel.org>
Cc: Ming Lei <ming.lei@redhat.com>
Cc: Radu Rendec <rrendec@redhat.com>
Cc: Sagi Grimberg <sagi@grimberg.me>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Add a new Kconfig symbol to make CONFIG_DEBUG_ATOMIC more useful on those
architectures which do not align dynamic allocations to 8-byte boundaries.
Without this, CONFIG_DEBUG_ATOMIC produces excessive WARN splats.
Link: https://lkml.kernel.org/r/6d25a12934fe9199332f4d65d17c17de450139a8.1768281748.git.fthain@linux-m68k.org
Signed-off-by: Finn Thain <fthain@linux-m68k.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: "Borislav Petkov (AMD)" <bp@alien8.de>
Cc: Daniel Borkman <daniel@iogearbox.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Dinh Nguyen <dinguyen@kernel.org>
Cc: Eduard Zingerman <eddyz87@gmail.com>
Cc: Gary Guo <gary@garyguo.net>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Guo Ren <guoren@kernel.org>
Cc: Hao Luo <haoluo@google.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: KP Singh <kpsingh@kernel.org>
Cc: Marc Rutland <mark.rutland@arm.com>
Cc: Martin KaFai Lau <martin.lau@linux.dev>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rich Felker <dalias@libc.org>
Cc: Sasha Levin (Microsoft) <sashal@kernel.org>
Cc: Song Liu <song@kernel.org>
Cc: Stafford Horne <shorne@gmail.com>
Cc: Stanislav Fomichev <sdf@fomichev.me>
Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Cc: Yonghong Song <yonghong.song@linux.dev>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Add a Kconfig option for debug builds which logs a warning when an
instrumented atomic operation takes place that's misaligned. Some
platforms don't trap for this.
[fthain@linux-m68k.org: added __DISABLE_EXPORTS conditional and refactored as helper function]
Link: https://lkml.kernel.org/r/51ebf844e006ca0de408f5d3a831e7b39d7fc31c.1768281748.git.fthain@linux-m68k.org
Link: https://lore.kernel.org/lkml/20250901093600.GF4067720@noisy.programming.kicks-ass.net/
Link: https://lore.kernel.org/linux-next/df9fbd22-a648-ada4-fee0-68fe4325ff82@linux-m68k.org/
Signed-off-by: Finn Thain <fthain@linux-m68k.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Suggested-by: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Sasha Levin <sashal@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Daniel Borkman <daniel@iogearbox.net>
Cc: Dinh Nguyen <dinguyen@kernel.org>
Cc: Eduard Zingerman <eddyz87@gmail.com>
Cc: Gary Guo <gary@garyguo.net>
Cc: Guo Ren <guoren@kernel.org>
Cc: Hao Luo <haoluo@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: KP Singh <kpsingh@kernel.org>
Cc: Marc Rutland <mark.rutland@arm.com>
Cc: Martin KaFai Lau <martin.lau@linux.dev>
Cc: Rich Felker <dalias@libc.org>
Cc: Song Liu <song@kernel.org>
Cc: Stafford Horne <shorne@gmail.com>
Cc: Stanislav Fomichev <sdf@fomichev.me>
Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi>
Cc: Will Deacon <will@kernel.org>
Cc: Yonghong Song <yonghong.song@linux.dev>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Pointless overhead to use a work queue to reset the static key for a
DO_ONCE_SLEEPABLE() invocation.
Note that the previous code path included a BUG_ON() if the static key
was already disabled. Dropped that as part of this change because:
1) Use of BUG_ON() is highly discouraged.
2) There is a WARN_ON() in the static_branch_disable() code path
that would provide adequate breadcrumbs to debug any issue.
Link: https://lkml.kernel.org/r/aWU4tfTju1l3oZCu@agluck-desk3
Signed-off-by: Tony Luck <tony.luck@intel.com>
Reported-by: Reinette Chatre <reinette.chatre@intel.com>
Cc: Eric Dumazet <edumazet@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
During the initialization phase, the test_kho module invokes the
kho_preserve_folio function, which internally configures bitmaps within
kho_mem_track and establishes chunk linked lists in KHO. Upon unloading
the test_kho module, it is necessary to clean up these states.
Link: https://lkml.kernel.org/r/20260107022427.4114424-1-longwei27@huawei.com
Signed-off-by: Long Wei <longwei27@huawei.com>
Acked-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Cc: Alexander Graf <graf@amazon.com>
Cc: hewenliang <hewenliang4@huawei.com>
Cc: Pasha Tatashin <pasha.tatashin@soleen.com>
Cc: Pratyush Yadav <pratyush@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
This patch converts the existing glob selftest (lib/globtest.c) to use the
KUnit framework (lib/tests/glob_kunit.c).
The new test:
- Migrates all 64 test cases from the original test to the KUnit suite.
- Removes the custom 'verbose' module parameter as KUnit handles logging.
- Updates Kconfig.debug and Makefile to support the new KUnit test.
- Updates Kconfig and Makefile to remove the original selftest.
- Updates GLOB_SELFTEST to GLOB_KUNIT_TEST for arch/m68k/configs.
This commit is verified by `./tools/testing/kunit/kunit.py run'
with the .kunit/.kunitconfig:
CONFIG_KUNIT=y
CONFIG_GLOB_KUNIT_TEST=y
Link: https://lkml.kernel.org/r/20260108120753.27339-1-note351@hotmail.com
Signed-off-by: Kir Chou <note351@hotmail.com>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Reviewed-by: David Gow <davidgow@google.com>
Reviewed-by: Kuan-Wei Chiu <visitorckw@gmail.com>
Cc: <kirchou@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
The comment for CONFIG_BOOTPARAM_HUNG_TASK_PANIC says:
Say N if unsure.
but since commit 9544f9e6947f ("hung_task: panic when there are more than
N hung tasks at the same time"), N is not a valid value for the option,
leading to a warning at build time:
.config:11736:warning: symbol value 'n' invalid for BOOTPARAM_HUNG_TASK_PANIC
as well as an error when given to menuconfig.
Fix the comment to say '0' instead of 'N'.
Link: https://lkml.kernel.org/r/20260106140140.136446-1-tglozar@redhat.com
Fixes: 9544f9e6947f ("hung_task: panic when there are more than N hung tasks at the same time")
Signed-off-by: Tomas Glozar <tglozar@redhat.com>
Reported-by: Johnny Mnemonic <jm@machine-hall.org>
Reviewed-by: Lance Yang <lance.yang@linux.dev>
Cc: Li RongQing <lirongqing@baidu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
The `struct kho_vmalloc` defines the in-memory layout for preserving
vmalloc regions across kexec. This layout is a contract between kernels
and part of the KHO ABI.
To reflect this relationship, the related structs and helper macros are
relocated to the ABI header, `include/linux/kho/abi/kexec_handover.h`.
This move places the structure's definition under the protection of the
KHO_FDT_COMPATIBLE version string.
The structure and its components are now also documented within the ABI
header to describe the contract and prevent ABI breaks.
[rppt@kernel.org: update comment, per Pratyush]
Link: https://lkml.kernel.org/r/aW_Mqp6HcqLwQImS@kernel.org
Link: https://lkml.kernel.org/r/20260105165839.285270-6-rppt@kernel.org
Signed-off-by: Jason Miu <jasonmiu@google.com>
Co-developed-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Cc: Alexander Graf <graf@amazon.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Pasha Tatashin <pasha.tatashin@soleen.com>
Cc: Pratyush Yadav <pratyush@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Commit ae5b3500856f ("kstrtox: add support for enabled and disabled in
kstrtobool()") added support for 'e'/'E' (enabled) and 'd'/'D' (disabled)
inputs, but did not update the docstring accordingly.
Update the docstring to include 'Ee' (for true) and 'Dd' (for false) in
the list of accepted first characters.
Link: https://lkml.kernel.org/r/20251227092229.57330-1-chaitanyamishra.ai@gmail.com
Fixes: ae5b3500856f ("kstrtox: add support for enabled and disabled in kstrtobool()")
Signed-off-by: Chaitanya Mishra <chaitanyamishra.ai@gmail.com>
Cc: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Move lib/test_min_heap.c to lib/tests/min_heap_kunit.c and convert it to
use KUnit.
This change switches the ad-hoc test code to standard KUnit test cases.
The test data remains the same, but the verification logic is updated to
use KUNIT_EXPECT_* macros.
Also remove CONFIG_TEST_MIN_HEAP from arch/*/configs/* because it is no
longer used. The new CONFIG_MIN_HEAP_KUNIT_TEST will be automatically
enabled by CONFIG_KUNIT_ALL_TESTS.
The reasons for converting to KUnit are:
1. Standardization:
Switching from ad-hoc printk-based reporting to the standard
KTAP format makes it easier for CI systems to parse and report test
results
2. Better Diagnostics:
Using KUNIT_EXPECT_* macros automatically provides detailed
diagnostics on failure.
3. Tooling Integration:
It allows the test to be managed and executed using standard
KUnit tools.
Link: https://lkml.kernel.org/r/20251221133516.321846-1-sakamo.ryota@gmail.com
Signed-off-by: Ryota Sakamoto <sakamo.ryota@gmail.com>
Acked-by: Kuan-Wei Chiu <visitorckw@gmail.com>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
Cc: David Gow <davidgow@google.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
To be consistent, pass the kmalloc_array_node() parameters in the order
(number_of_elements, element_size). Since only the product of the two
values is used, this is not a bug fix.
Link: https://lkml.kernel.org/r/20251220054541.2295599-1-rdunlap@infradead.org
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=216015
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Stefani Seibold <stefani@seibold.net>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Reinitialize metadata for large zone device private folios in
zone_device_page_init prior to creating a higher-order zone device private
folio. This step is necessary when the folio's order changes dynamically
between zone_device_page_init calls to avoid building a corrupt folio. As
part of the metadata reinitialization, the dev_pagemap must be passed in
from the caller because the pgmap stored in the folio page may have been
overwritten with a compound head.
Without this fix, individual pages could have invalid pgmap fields and
flags (with PG_locked being notably problematic) due to prior different
order allocations, which can, and will, result in kernel crashes.
Link: https://lkml.kernel.org/r/20260116111325.1736137-2-francois.dugast@intel.com
Fixes: d245f9b4ab80 ("mm/zone_device: support large zone device private folios")
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Francois Dugast <francois.dugast@intel.com>
Acked-by: Felix Kuehling <felix.kuehling@amd.com>
Reviewed-by: Balbir Singh <balbirs@nvidia.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Zi Yan <ziy@nvidia.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: "Christophe Leroy (CS GROUP)" <chleroy@kernel.org>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: David Airlie <airlied@gmail.com>
Cc: Simona Vetter <simona@ffwll.ch>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Maxime Ripard <mripard@kernel.org>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: Lyude Paul <lyude@redhat.com>
Cc: Danilo Krummrich <dakr@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Leon Romanovsky <leon@kernel.org>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Bernd has reported a lockdep splat from flexible proportions code that is
essentially complaining about the following race:
<timer fires>
run_timer_softirq - we are in softirq context
call_timer_fn
writeout_period
fprop_new_period
write_seqcount_begin(&p->sequence);
<hardirq is raised>
...
blk_mq_end_request()
blk_update_request()
ext4_end_bio()
folio_end_writeback()
__wb_writeout_add()
__fprop_add_percpu_max()
if (unlikely(max_frac < FPROP_FRAC_BASE)) {
fprop_fraction_percpu()
seq = read_seqcount_begin(&p->sequence);
- sees odd sequence so loops indefinitely
Note that a deadlock like this is only possible if the bdi has configured
maximum fraction of writeout throughput which is very rare in general but
frequent for example for FUSE bdis. To fix this problem we have to make
sure write section of the sequence counter is irqsafe.
Link: https://lkml.kernel.org/r/20260121112729.24463-2-jack@suse.cz
Fixes: a91befde3503 ("lib/flex_proportions.c: remove local_irq_ops in fprop_new_period()")
Signed-off-by: Jan Kara <jack@suse.cz>
Reported-by: Bernd Schubert <bernd@bsbernd.com>
Link: https://lore.kernel.org/all/9b845a47-9aee-43dd-99bc-1a82bea00442@bsbernd.com/
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Joanne Koong <joannelkoong@gmail.com>
Cc: Miklos Szeredi <miklos@szeredi.hu>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
With CONFIG_CC_OPTIMIZE_FOR_SIZE=y, GCC may decide not to inline
__cvdso_clock_getres_common(). This introduces spurious internal
function calls in the vDSO fastpath.
Furthermore, with automatic stack variable initialization
(CONFIG_INIT_STACK_ALL_ZERO or CONFIG_INIT_STACK_ALL_PATTERN) GCC can emit
a call to memset() which is not valid in the vDSO.
Mark __cvdso_clock_getres_common() as __always_inline to avoid both issues.
Paradoxically the inlining even reduces the size of the code:
$ ./scripts/bloat-o-meter arch/powerpc/kernel/vdso/vgettimeofday-32.o.before arch/powerpc/kernel/vdso/vgettimeofday-32.o.after
add/remove: 0/1 grow/shrink: 1/1 up/down: 52/-148 (-96)
Function old new delta
__c_kernel_clock_getres_time64 92 144 +52
__c_kernel_clock_getres 136 132 -4
__cvdso_clock_getres_common 144 - -144
Total: Before=2788, After=2692, chg -3.44%
With CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE=y the functions are always inlined
and therefore the behaviour stays the same.
See also the equivalent change for clock_gettime() in commit b91c8c42ffdd
("lib/vdso: Force inlining of __cvdso_clock_gettime_common()").
Fixes: 21bbfd74044f ("x86/vdso: Provide clock_getres_time64() for x86-32")
Fixes: 1149dcdfc9ef ("ARM: VDSO: Provide clock_getres_time64()")
Fixes: f10c2e72b5de ("arm64: vdso32: Provide clock_getres_time64()")
Fixes: bec06cd6a140 ("MIPS: vdso: Provide getres_time64() for 32-bit ABIs")
Fixes: 759a1f97373f ("powerpc/vdso: Provide clock_getres_time64()")
Reported-by: Sverdlin, Alexander <alexander.sverdlin@siemens.com>
Suggested-by: Christophe Leroy (CS GROUP) <chleroy@kernel.org>
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Link: https://lore.kernel.org/lkml/230c749f-ebd6-4829-93ee-601d88000a45@kernel.org/
Link: https://patch.msgid.link/20260123-vdso-clock_getres-inline-v1-1-4d6203b90cd3@linutronix.de
Closes: https://lore.kernel.org/lkml/f45316f65a46da638b3c6aa69effd8980e6677b9.camel@siemens.com/
|
|
The softlockup_panic sysctl is currently a binary option: panic
immediately or never panic on soft lockups.
Panicking on any soft lockup, regardless of duration, can be overly
aggressive for brief stalls that may be caused by legitimate operations.
Conversely, never panicking may allow severe system hangs to persist
undetected.
Extend softlockup_panic to accept an integer threshold, allowing the
kernel to panic only when the normalized lockup duration exceeds N
watchdog threshold periods. This provides finer-grained control to
distinguish between transient delays and persistent system failures.
The accepted values are:
- 0: Don't panic (unchanged)
- 1: Panic when duration >= 1 * threshold (20s default, original behavior)
- N > 1: Panic when duration >= N * threshold (e.g., 2 = 40s, 3 = 60s.)
The original behavior is preserved for values 0 and 1, maintaining full
backward compatibility while allowing systems to tolerate brief lockups
while still catching severe, persistent hangs.
[lirongqing@baidu.com: v2]
Link: https://lkml.kernel.org/r/20251218074300.4080-1-lirongqing@baidu.com
Link: https://lkml.kernel.org/r/20251216074521.2796-1-lirongqing@baidu.com
Signed-off-by: Li RongQing <lirongqing@baidu.com>
Cc: Eduard Zingerman <eddyz87@gmail.com>
Cc: Hao Luo <haoluo@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: KP Singh <kpsingh@kernel.org>
Cc: Lance Yang <lance.yang@linux.dev>
Cc: Martin KaFai Lau <martin.lau@linux.dev>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Song Liu <song@kernel.org>
Cc: Stanislav Fomichev <sdf@fomichev.me>
Cc: Yonghong Song <yonghong.song@linux.dev>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Remove <linux/hex.h> from <linux/kernel.h> and update all users/callers of
hex.h interfaces to directly #include <linux/hex.h> as part of the process
of putting kernel.h on a diet.
Removing hex.h from kernel.h means that 36K C source files don't have to
pay the price of parsing hex.h for the roughly 120 C source files that
need it.
This change has been build-tested with allmodconfig on most ARCHes. Also,
all users/callers of <linux/hex.h> in the entire source tree have been
updated if needed (if not already #included).
Link: https://lkml.kernel.org/r/20251215005206.2362276-1-rdunlap@infradead.org
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Yury Norov (NVIDIA) <yury.norov@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Move lib/test_uuid.c to lib/tests/uuid_kunit.c and convert it to use KUnit.
This change switches the ad-hoc test code to standard KUnit test cases.
The test data remains the same, but the verification logic is updated to
use KUNIT_EXPECT_* macros.
Also remove CONFIG_TEST_UUID from arch/*/configs/* because it is no longer
used. The new CONFIG_UUID_KUNIT_TEST will be automatically enabled by
CONFIG_KUNIT_ALL_TESTS.
[lukas.bulwahn@redhat.com: MAINTAINERS: adjust file entry in UUID HELPERS]
Link: https://lkml.kernel.org/r/20251217053907.2778515-1-lukas.bulwahn@redhat.com
Link: https://lkml.kernel.org/r/20251215134322.12949-1-sakamo.ryota@gmail.com
Signed-off-by: Ryota Sakamoto <sakamo.ryota@gmail.com>
Signed-off-by: Lukas Bulwahn <lukas.bulwahn@redhat.com>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Reviewed-by: David Gow <davidgow@google.com>
Cc: Andriy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Brendan Higgins <brendan.higgins@linux.dev>
Cc: Lukas Bulwahn <lukas.bulwahn@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
The current OID registry parser uses 64 bit arithmetic which limits us to
supporting 64 bit or smaller OIDs. This isn't usually a problem except
that it prevents us from representing the 2.25. prefix OIDs which are the
OID representation of UUIDs and have a 128 bit number following the
prefix. Rather than import not often used perl arithmetic modules,
replace the current perl 64 bit arithmetic with a callout to bc, which is
arbitrary precision, for decimal to base 2 conversion, then do pure string
operations on the base 2 number.
[James.Bottomley@HansenPartnership.com: tidy up perl with better my placement also set bc to arbitrary size]
Link: https://lkml.kernel.org/r/dbc90c344c691ed988640a28367ff895b5ef2604.camel@HansenPartnership.com
Link: https://lkml.kernel.org/r/833c858cd74533203b43180208734b84f1137af0.camel@HansenPartnership.com
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Blaise Boscaccy <bboscaccy@linux.microsoft.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
If PAGE_SIZE is larger than 4k and if you have a system with a large
number of CPUs, this test can require a very large amount of memory
leading to oom-killer firing. Given the type of allocation, the kernel
won't have anything to kill, causing the system to stall.
Add a parameter to the test_vmalloc driver to represent the number of
times a percpu object will be allocated. Calculate this in
test_vmalloc.sh to be 90% of available memory or the current default of
35000, whichever is smaller.
Link: https://lkml.kernel.org/r/20251201181848.1216197-1-audra@redhat.com
Signed-off-by: Audra Mitchell <audra@redhat.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Rafael Aquini <raquini@redhat.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: "Uladzislau Rezki (Sony)" <urezki@gmail.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Remove the change in file mode permissions done before initializing the
sysctl. It is not necessary as the writing of the kernel variable will be
blocked by the proc_mem_profiling_handler when writing is disallowed (also
controlled by mem_profiling_support).
Link: https://lkml.kernel.org/r/20251215-jag-alloc_tag_const-v1-1-35ea56a1ce13@kernel.org
Signed-off-by: Joel Granados <joel.granados@kernel.org>
Acked-by: Suren Baghdasaryan <surenb@google.com>
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
The volatile keyword is no longer necessary or useful on aes_sbox and
aes_inv_sbox, since the table prefetching is now done using a helper
function that casts to volatile itself and also includes an optimization
barrier. Since it prevents some compiler optimizations, remove it.
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20260112192035.10427-36-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|
|
Now that all callers of the aes_encrypt() and aes_decrypt() type-generic
macros are using the new types, remove the old functions.
Then, replace the macro with direct calls to the new functions, dropping
the "_new" suffix from them.
This completes the change in the type of the key struct that is passed
to aes_encrypt() and aes_decrypt().
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20260112192035.10427-35-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|
|
Switch from the old AES library functions (which use struct
crypto_aes_ctx) to the new ones (which use struct aes_enckey). This
eliminates the unnecessary computation and caching of the decryption
round keys. The new AES en/decryption functions are also much faster
and use AES instructions when supported by the CPU.
Note that in addition to the change in the key preparation function and
the key struct type itself, the change in the type of the key struct
results in aes_encrypt() (which is temporarily a type-generic macro)
calling the new encryption function rather than the old one.
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20260112192035.10427-34-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|
|
Switch from the old AES library functions (which use struct
crypto_aes_ctx) to the new ones (which use struct aes_enckey). This
eliminates the unnecessary computation and caching of the decryption
round keys. The new AES en/decryption functions are also much faster
and use AES instructions when supported by the CPU.
Note that in addition to the change in the key preparation function and
the key struct type itself, the change in the type of the key struct
results in aes_encrypt() (which is temporarily a type-generic macro)
calling the new encryption function rather than the old one.
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20260112192035.10427-33-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|