| Age | Commit message (Collapse) | Author |
|
In the group_has_spare case, the function creates a temporary cpumask
to just calculate weight of (p->cpus_ptr & sched_group_span(local)).
We've got a dedicated helper for it.
Signed-off-by: Yury Norov (NVIDIA) <yury.norov@gmail.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
Reviewed-by: K Prateek Nayak <kprateek.nayak@amd.com>
Reviewed-by: Fernand Sieber <sieberf@amazon.com>
Link: https://patch.msgid.link/20251207034247.402926-1-yury.norov@gmail.com
|
|
Use for_each_cpu_and() and drop some housekeeping code.
Signed-off-by: Yury Norov (NVIDIA) <yury.norov@gmail.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: K Prateek Nayak <kprateek.nayak@amd.com>
Reviewed-by: Phil Auld <pauld@redhat.com>
Link: https://patch.msgid.link/20251207033037.399608-1-yury.norov@gmail.com
|
|
cpumask_empty() call is O(N) and useless because the previous
cpumask_and() returns false for empty 'cpus'. Drop it.
Signed-off-by: Yury Norov (NVIDIA) <yury.norov@gmail.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Madadi Vineeth Reddy <vineethr@linux.ibm.com>
Reviewed-by: K Prateek Nayak <kprateek.nayak@amd.com>
Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
Link: https://patch.msgid.link/20251207040543.407695-1-yury.norov@gmail.com
|
|
BPF_MAP_TYPE_INSN_ARRAY maps store instruction pointers in their
ips array, not string data. The map_direct_value_addr callback for
this map type returns the address of the ips array, which is not
suitable for use as a constant string argument.
When a BPF program passes a pointer to an insn_array map value as
ARG_PTR_TO_CONST_STR (e.g., to bpf_snprintf), the verifier's
null-termination check in check_reg_const_str() operates on the
wrong memory region, and at runtime bpf_bprintf_prepare() can read
out of bounds searching for a null terminator.
Reject BPF_MAP_TYPE_INSN_ARRAY in check_reg_const_str() since this
map type is not designed to hold string data.
Reported-by: syzbot+2c29addf92581b410079@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=2c29addf92581b410079
Tested-by: syzbot+2c29addf92581b410079@syzkaller.appspotmail.com
Fixes: 493d9e0d6083 ("bpf, x86: add support for indirect jumps")
Signed-off-by: Deepanshu Kartikey <kartikey406@gmail.com>
Acked-by: Anton Protopopov <a.s.protopopov@gmail.com>
Link: https://lore.kernel.org/r/20260107021037.289644-1-kartikey406@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
The cgrp_ancestor_storage has two drawbacks:
- it's not guaranteed that the member immediately follows struct cgrp in
cgroup_root (root cgroup's ancestors[0] might thus point to a padding
and not in cgrp_ancestor_storage proper),
- this idiom raises warnings with -Wflex-array-member-not-at-end.
Instead of relying on the auxiliary member in cgroup_root, define the
0-th level ancestor inside struct cgroup (needed for static allocation
of cgrp_dfl_root), deeper cgroups would allocate flexible
_low_ancestors[]. Unionized alias through ancestors[] will
transparently join the two ranges.
The above change would still leave the flexible array at the end of
struct cgroup inside cgroup_root, so move cgrp also towards the end of
cgroup_root to resolve the -Wflex-array-member-not-at-end.
Link: https://lore.kernel.org/r/5fb74444-2fbb-476e-b1bf-3f3e279d0ced@embeddedor.com/
Reported-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Closes: https://lore.kernel.org/r/b3eb050d-9451-4b60-b06c-ace7dab57497@embeddedor.com/
Cc: David Laight <david.laight.linux@gmail.com>
Acked-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: Michal Koutný <mkoutny@suse.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
With IA-64 now gone, there are no users of the dma_mark_clean hook,
so we can retire it for good.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Link: https://lore.kernel.org/r/c004927f01962726ff1dcf94d1b4efff84db805a.1767727673.git.robin.murphy@arm.com
|
|
The ftrace_dump_on_oops string is not used outside of trace.c so
make it static to avoid the export warning from sparse:
kernel/trace/trace.c:141:6: warning: symbol 'ftrace_dump_on_oops' was not declared. Should it be static?
Fixes: dd293df6395a2 ("tracing: Move trace sysctls into trace.c")
Link: https://patch.msgid.link/20260106231054.84270-1-ben.dooks@codethink.co.uk
Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
|
|
A bug was reported about an infinite recursion caused by tracing the rcu
events with the kernel stack trace trigger enabled. The stack trace code
called back into RCU which then called the stack trace again.
Expand the ftrace recursion protection to add a set of bits to protect
events from recursion. Each bit represents the context that the event is
in (normal, softirq, interrupt and NMI).
Have the stack trace code use the interrupt context to protect against
recursion.
Note, the bug showed an issue in both the RCU code as well as the tracing
stacktrace code. This only handles the tracing stack trace side of the
bug. The RCU fix will be handled separately.
Link: https://lore.kernel.org/all/20260102122807.7025fc87@gandalf.local.home/
Cc: stable@vger.kernel.org
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Joel Fernandes <joel@joelfernandes.org>
Cc: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
Link: https://patch.msgid.link/20260105203141.515cd49f@gandalf.local.home
Reported-by: Yao Kai <yaokai34@huawei.com>
Tested-by: Yao Kai <yaokai34@huawei.com>
Fixes: 5f5fa7ea89dc ("rcu: Don't use negative nesting depth in __rcu_read_unlock()")
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
|
|
When user resize all trace ring buffer through file 'buffer_size_kb',
then in ring_buffer_resize(), kernel allocates buffer pages for each
cpu in a loop.
If the kernel preemption model is PREEMPT_NONE and there are many cpus
and there are many buffer pages to be freed, it may not give up cpu
for a long time and finally cause a softlockup.
To avoid it, call cond_resched() after each cpu buffer free as Commit
f6bd2c92488c ("ring-buffer: Avoid softlockup in ring_buffer_resize()")
does.
Detailed call trace as follow:
rcu: INFO: rcu_sched self-detected stall on CPU
rcu: 24-....: (14837 ticks this GP) idle=521c/1/0x4000000000000000 softirq=230597/230597 fqs=5329
rcu: (t=15004 jiffies g=26003221 q=211022 ncpus=96)
CPU: 24 UID: 0 PID: 11253 Comm: bash Kdump: loaded Tainted: G EL 6.18.2+ #278 NONE
pc : arch_local_irq_restore+0x8/0x20
arch_local_irq_restore+0x8/0x20 (P)
free_frozen_page_commit+0x28c/0x3b0
__free_frozen_pages+0x1c0/0x678
___free_pages+0xc0/0xe0
free_pages+0x3c/0x50
ring_buffer_resize.part.0+0x6a8/0x880
ring_buffer_resize+0x3c/0x58
__tracing_resize_ring_buffer.part.0+0x34/0xd8
tracing_resize_ring_buffer+0x8c/0xd0
tracing_entries_write+0x74/0xd8
vfs_write+0xcc/0x288
ksys_write+0x74/0x118
__arm64_sys_write+0x24/0x38
Cc: <mathieu.desnoyers@efficios.com>
Link: https://patch.msgid.link/20251228065008.2396573-1-mawupeng1@huawei.com
Signed-off-by: Wupeng Ma <mawupeng1@huawei.com>
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
|
|
soft_mode is not read in the enable case, so drop the assignment.
Drop also the comment text that refers to the assignment and realign
the comment.
Cc: "Paul E . McKenney" <paulmck@kernel.org>
Cc: Gabriele Paoloni <gpaoloni@redhat.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Link: https://patch.msgid.link/20251226110531.4129794-1-Julia.Lawall@inria.fr
Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
|
|
For use the init_srcu_struct*() to initialized srcu structure,
the srcu structure's->srcu_sup and sda use GFP_KERNEL flags to
allocate memory. similarly, if set SRCU_SIZING_INIT, the
srcu_sup's->node can still use GFP_KERNEL flags to allocate
memory, not need to use GFP_ATOMIC flags all the time.
Signed-off-by: Zqiang <qiang.zhang@linux.dev>
Reviewed-by: Joel Fernandes <joelagnelf@nvidia.com>
Tested-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com>
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
|
|
Commit 5f5fa7ea89dc ("rcu: Don't use negative nesting depth in
__rcu_read_unlock()") removes the recursion-protection code from
__rcu_read_unlock(). Therefore, we could invoke the deadloop in
raise_softirq_irqoff() with ftrace enabled as follows:
WARNING: CPU: 0 PID: 0 at kernel/trace/trace.c:3021 __ftrace_trace_stack.constprop.0+0x172/0x180
Modules linked in: my_irq_work(O)
CPU: 0 UID: 0 PID: 0 Comm: swapper/0 Tainted: G O 6.18.0-rc7-dirty #23 PREEMPT(full)
Tainted: [O]=OOT_MODULE
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
RIP: 0010:__ftrace_trace_stack.constprop.0+0x172/0x180
RSP: 0018:ffffc900000034a8 EFLAGS: 00010002
RAX: 0000000000000000 RBX: 0000000000000004 RCX: 0000000000000000
RDX: 0000000000000003 RSI: ffffffff826d7b87 RDI: ffffffff826e9329
RBP: 0000000000090009 R08: 0000000000000005 R09: ffffffff82afbc4c
R10: 0000000000000008 R11: 0000000000011d7a R12: 0000000000000000
R13: ffff888003874100 R14: 0000000000000003 R15: ffff8880038c1054
FS: 0000000000000000(0000) GS:ffff8880fa8ea000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000055b31fa7f540 CR3: 00000000078f4005 CR4: 0000000000770ef0
PKRU: 55555554
Call Trace:
<IRQ>
trace_buffer_unlock_commit_regs+0x6d/0x220
trace_event_buffer_commit+0x5c/0x260
trace_event_raw_event_softirq+0x47/0x80
raise_softirq_irqoff+0x6e/0xa0
rcu_read_unlock_special+0xb1/0x160
unwind_next_frame+0x203/0x9b0
__unwind_start+0x15d/0x1c0
arch_stack_walk+0x62/0xf0
stack_trace_save+0x48/0x70
__ftrace_trace_stack.constprop.0+0x144/0x180
trace_buffer_unlock_commit_regs+0x6d/0x220
trace_event_buffer_commit+0x5c/0x260
trace_event_raw_event_softirq+0x47/0x80
raise_softirq_irqoff+0x6e/0xa0
rcu_read_unlock_special+0xb1/0x160
unwind_next_frame+0x203/0x9b0
__unwind_start+0x15d/0x1c0
arch_stack_walk+0x62/0xf0
stack_trace_save+0x48/0x70
__ftrace_trace_stack.constprop.0+0x144/0x180
trace_buffer_unlock_commit_regs+0x6d/0x220
trace_event_buffer_commit+0x5c/0x260
trace_event_raw_event_softirq+0x47/0x80
raise_softirq_irqoff+0x6e/0xa0
rcu_read_unlock_special+0xb1/0x160
unwind_next_frame+0x203/0x9b0
__unwind_start+0x15d/0x1c0
arch_stack_walk+0x62/0xf0
stack_trace_save+0x48/0x70
__ftrace_trace_stack.constprop.0+0x144/0x180
trace_buffer_unlock_commit_regs+0x6d/0x220
trace_event_buffer_commit+0x5c/0x260
trace_event_raw_event_softirq+0x47/0x80
raise_softirq_irqoff+0x6e/0xa0
rcu_read_unlock_special+0xb1/0x160
__is_insn_slot_addr+0x54/0x70
kernel_text_address+0x48/0xc0
__kernel_text_address+0xd/0x40
unwind_get_return_address+0x1e/0x40
arch_stack_walk+0x9c/0xf0
stack_trace_save+0x48/0x70
__ftrace_trace_stack.constprop.0+0x144/0x180
trace_buffer_unlock_commit_regs+0x6d/0x220
trace_event_buffer_commit+0x5c/0x260
trace_event_raw_event_softirq+0x47/0x80
__raise_softirq_irqoff+0x61/0x80
__flush_smp_call_function_queue+0x115/0x420
__sysvec_call_function_single+0x17/0xb0
sysvec_call_function_single+0x8c/0xc0
</IRQ>
Commit b41642c87716 ("rcu: Fix rcu_read_unlock() deadloop due to IRQ work")
fixed the infinite loop in rcu_read_unlock_special() for IRQ work by
setting a flag before calling irq_work_queue_on(). We fix this issue by
setting the same flag before calling raise_softirq_irqoff() and rename the
flag to defer_qs_pending for more common.
Fixes: 5f5fa7ea89dc ("rcu: Don't use negative nesting depth in __rcu_read_unlock()")
Reported-by: Tengda Wu <wutengda2@huawei.com>
Signed-off-by: Yao Kai <yaokai34@huawei.com>
Reviewed-by: Joel Fernandes <joelagnelf@nvidia.com>
Tested-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com>
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
|
|
Lack of parentheses causes the ->exp_current() function, for example,
srcu_expedite_current(), to be called only once in four billion times
instead of the intended once in 256 times. This commit therefore adds
the needed parentheses.
Reported-by: Chris Mason <clm@meta.com>
Reported-by: Joel Fernandes <joelagnelf@nvidia.com>
Fixes: 950063c6e897 ("rcutorture: Test srcu_expedite_current()")
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
|
|
If an expedited RCU CPU stall ends just at the stall-warning timeout,
the current code will print an expedited stall-warning message, but one
that doesn't identify any CPUs or tasks causing the stall. This is most
likely to happen for short-timeout stalls, for example, the 20-millisecond
timeouts that are sometimes used for small embedded devices. Needless to
say, these semi-empty stall-warning messages can be rather confusing.
One option would be to suppress the stall-warning message entirely in
this case, but the near-miss information can be quite valuable.
Detect this race condition and emits a "INFO: Expedited stall ended
before state dump start" message to clarify matters.
[boqun: Apply feedback from Borislav]
Reported-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Acked-by: Borislav Petkov (AMD) <bp@alien8.de>
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
|
|
percpu_cgroup_storage maps
Introduce BPF_F_ALL_CPUS flag support for percpu_cgroup_storage maps to
allow updating values for all CPUs with a single value for update_elem
API.
Introduce BPF_F_CPU flag support for percpu_cgroup_storage maps to
allow:
* update value for specified CPU for update_elem API.
* lookup value for specified CPU for lookup_elem API.
The BPF_F_CPU flag is passed via map_flags along with embedded cpu info.
Signed-off-by: Leon Hwang <leon.hwang@linux.dev>
Link: https://lore.kernel.org/r/20260107022022.12843-6-leon.hwang@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Copy map value using 'copy_map_value_long()'. It's to keep consistent
style with the way of other percpu maps.
No functional change intended.
Signed-off-by: Leon Hwang <leon.hwang@linux.dev>
Link: https://lore.kernel.org/r/20260107022022.12843-5-leon.hwang@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
lru_percpu_hash maps
Introduce BPF_F_ALL_CPUS flag support for percpu_hash and lru_percpu_hash
maps to allow updating values for all CPUs with a single value for both
update_elem and update_batch APIs.
Introduce BPF_F_CPU flag support for percpu_hash and lru_percpu_hash
maps to allow:
* update value for specified CPU for both update_elem and update_batch
APIs.
* lookup value for specified CPU for both lookup_elem and lookup_batch
APIs.
The BPF_F_CPU flag is passed via:
* map_flags along with embedded cpu info.
* elem_flags along with embedded cpu info.
Signed-off-by: Leon Hwang <leon.hwang@linux.dev>
Link: https://lore.kernel.org/r/20260107022022.12843-4-leon.hwang@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Introduce support for the BPF_F_ALL_CPUS flag in percpu_array maps to
allow updating values for all CPUs with a single value for both
update_elem and update_batch APIs.
Introduce support for the BPF_F_CPU flag in percpu_array maps to allow:
* update value for specified CPU for both update_elem and update_batch
APIs.
* lookup value for specified CPU for both lookup_elem and lookup_batch
APIs.
The BPF_F_CPU flag is passed via:
* map_flags of lookup_elem and update_elem APIs along with embedded cpu
info.
* elem_flags of lookup_batch and update_batch APIs along with embedded
cpu info.
Signed-off-by: Leon Hwang <leon.hwang@linux.dev>
Link: https://lore.kernel.org/r/20260107022022.12843-3-leon.hwang@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Introduce BPF_F_CPU and BPF_F_ALL_CPUS flags and check them for
following APIs:
* 'map_lookup_elem()'
* 'map_update_elem()'
* 'generic_map_lookup_batch()'
* 'generic_map_update_batch()'
And, get the correct value size for these APIs.
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Leon Hwang <leon.hwang@linux.dev>
Link: https://lore.kernel.org/r/20260107022022.12843-2-leon.hwang@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Leverage ARCH_HAS_DMA_MAP_DIRECT config option for coherent allocations as
well. This will bypass DMA ops for memory allocations that have been
pre-mapped.
Always set device bus_dma_limit when memory is pre-mapped. In some
architectures, like PowerPC, pmemory can be converted to regular memory via
daxctl command. This will gate the coherent allocations to pre-mapped RAM
only, by dma_coherent_ok().
Signed-off-by: Gaurav Batra <gbatra@linux.ibm.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20251107161105.85999-1-gbatra@linux.ibm.com
|
|
The function set_security_override_from_ctx() has no in-tree callers
since 6.14. Remove it.
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
[PM: subject tweak, merge fuzz]
Signed-off-by: Paul Moore <paul@paul-moore.com>
|
|
The bpf_arena_*_pages() kfuncs can be called from sleepable contexts,
but the verifier still prevents BPF programs from calling them while
holding a spinlock. Amend the verifier to allow for BPF programs
calling arena page management functions while holding a lock.
Signed-off-by: Emil Tsalapatis <emil@etsalapatis.com>
Link: https://lore.kernel.org/r/20260106-arena-under-lock-v2-2-378e9eab3066@etsalapatis.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
The in_sleepable_context() function is used to specialize the BPF code
in do_misc_fixups(). With the addition of nonsleepable arena kfuncs,
there are kfuncs whose specialization depends on whether we are
holding a lock. We should use the nonsleepable version while
holding a lock and the sleepable one when not.
Add a check for active_locks to account for locking when specializing
arena kfuncs.
Signed-off-by: Emil Tsalapatis <emil@etsalapatis.com>
Link: https://lore.kernel.org/r/20260106-arena-under-lock-v2-1-378e9eab3066@etsalapatis.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Replace deprecated kmap_atomic() with kmap_local_page().
Signed-off-by: Keke Ming <ming.jvle@gmail.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Link: https://patch.msgid.link/20260103084243.195125-6-ming.jvle@gmail.com
|
|
Remove SYSCTL_INT_CONV_CUSTOM and replace it with proc_int_conv. This
converter function expects a negp argument as it can take on negative
values. Update all jiffies converters to use explicit function calls.
Remove SYSCTL_CONV_IDENTITY as it is no longer used.
Signed-off-by: Joel Granados <joel.granados@kernel.org>
|
|
Replace SYSCTL_USER_TO_KERN_INT_CONV and SYSCTL_KERN_TO_USER_INT_CONV
macros with function implementing the same logic.This makes debugging
easier and aligns with the functions preference described in
coding-style.rst. Update all jiffies converters to use explicit function
implementations instead of macro-generated versions.
Signed-off-by: Joel Granados <joel.granados@kernel.org>
|
|
When crypto_alloc_acomp() fails, it returns an ERR_PTR value, not NULL.
The cleanup code in save_compressed_image() and load_compressed_image()
unconditionally calls crypto_free_acomp() without checking for ERR_PTR,
which causes crypto_acomp_tfm() to dereference an invalid pointer and
crash the kernel.
This can be triggered when the compression algorithm is unavailable
(e.g., CONFIG_CRYPTO_LZO not enabled).
Fix by adding IS_ERR_OR_NULL() checks before calling crypto_free_acomp()
and acomp_request_free(), similar to the existing kthread_stop() check.
Fixes: b03d542c3c95 ("PM: hibernate: Use crypto_acomp interface")
Signed-off-by: Malaya Kumar Rout <mrout@redhat.com>
Cc: 6.15+ <stable@vger.kernel.org> # 6.15+
[ rjw: Added 2 empty code lines ]
Link: https://patch.msgid.link/20251230115613.64080-1-mrout@redhat.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
This demonstrates a larger conversion to use Clang's context
analysis. The benefit is additional static checking of locking rules,
along with better documentation.
Notably, kernel/sched contains sufficiently complex synchronization
patterns, and application to core.c & fair.c demonstrates that the
latest Clang version has become powerful enough to start applying this
to more complex subsystems (with some modest annotations and changes).
Signed-off-by: Marco Elver <elver@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/20251219154418.3592607-37-elver@google.com
|
|
With Sparse support gone, Clang is a bit more strict and warns:
./include/linux/console.h:492:50: error: use of undeclared identifier 'console_mutex'
492 | extern void console_list_unlock(void) __releases(console_mutex);
Since it does not make sense to make console_mutex itself global, move
the annotation to printk.c. Context analysis remains disabled for
printk.c.
This is needed to enable context analysis for modules that include
<linux/console.h>.
Signed-off-by: Marco Elver <elver@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/20251219154418.3592607-34-elver@google.com
|
|
Enable context analysis for the KCSAN subsystem.
Signed-off-by: Marco Elver <elver@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/20251219154418.3592607-31-elver@google.com
|
|
Enable context analysis for the KCOV subsystem.
Signed-off-by: Marco Elver <elver@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/20251219154418.3592607-30-elver@google.com
|
|
As discussed in [1], removing __cond_lock() will improve the readability
of trylock code. Now that Sparse context tracking support has been
removed, we can also remove __cond_lock().
Change existing APIs to either drop __cond_lock() completely, or make
use of the __cond_acquires() function attribute instead.
In particular, spinlock and rwlock implementations required switching
over to inline helpers rather than statement-expressions for their
trylock_* variants.
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Marco Elver <elver@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/all/20250207082832.GU7145@noisy.programming.kicks-ass.net/ [1]
Link: https://patch.msgid.link/20251219154418.3592607-25-elver@google.com
|
|
This commit is making sure that all the functions that are part of the
API are documented.
Signed-off-by: Joel Granados <joel.granados@kernel.org>
|
|
Replace the SYSCTL_USER_TO_KERN_UINT_CONV and SYSCTL_UINT_CONV_CUSTOM
macros with functions with the same logic. This makes debugging easier
and aligns with the functions preference described in coding-style.rst.
Update the only user of this API: pipe.c.
Signed-off-by: Joel Granados <joel.granados@kernel.org>
|
|
Specify that the range check is only when assigning kernel variable
Signed-off-by: Joel Granados <joel.granados@kernel.org>
|
|
Ensure an error if prco_douintvec_conv is erroneously called in a system
with CONFIG_PROC_SYSCTL=n
Signed-off-by: Joel Granados <joel.granados@kernel.org>
|
|
Remove superfluous forward declarations of ctl_table from header files
where they are no longer needed. These declarations were left behind
after sysctl code refactoring and cleanup.
Reviewed-by: Jan Kara <jack@suse.cz>
Acked-by: Muchun Song <muchun.song@linux.dev>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Joel Granados <joel.granados@kernel.org>
|
|
Add kernel-doc documentation for the proc_dointvec_conv function to
describe its parameters and return value.
Reviewed-by: Kees Cook <kees@kernel.org>
Signed-off-by: Joel Granados <joel.granados@kernel.org>
|
|
|
|
With the change to hrtimer_try_to_cancel() in
perf_swevent_cancel_hrtimer() it appears possible for the hrtimer to
still be active by the time the event gets freed.
Make sure the event does a full hrtimer_cancel() on the free path by
installing a perf_event::destroy handler.
Fixes: eb3182ef0405 ("perf/core: Fix system hang caused by cpu-clock usage")
Reported-by: CyberUnicorns <a101e_iotvul@163.com>
Tested-by: CyberUnicorns <a101e_iotvul@163.com>
Debugged-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
|
|
* rcu-torture.20260104a:
rcutorture: Add --kill-previous option to terminate previous kvm.sh runs
rcutorture: Prevent concurrent kvm.sh runs on same source tree
torture: Include commit discription in testid.txt
torture: Make config2csv.sh properly handle comments in .boot files
torture: Make kvm-series.sh give run numbers and totals
torture: Make kvm-series.sh give build numbers and totals
torture: Parallelize kvm-series.sh guest-OS execution
rcutorture: Add context checks to rcu_torture_timer()
|
|
The __opt annotation was originally introduced specifically for
buffer/size argument pairs in bpf_dynptr_slice() and
bpf_dynptr_slice_rdwr(), allowing the buffer pointer to be NULL while
still validating the size as a constant. The __nullable annotation
serves the same purpose but is more general and is already used
throughout the BPF subsystem for raw tracepoints, struct_ops, and other
kfuncs.
This patch unifies the two annotations by replacing __opt with
__nullable. The key change is in the verifier's
get_kfunc_ptr_arg_type() function, where mem/size pair detection is now
performed before the nullable check. This ensures that buffer/size
pairs are correctly classified as KF_ARG_PTR_TO_MEM_SIZE even when the
buffer is nullable, while adding an !arg_mem_size condition to the
nullable check prevents interference with mem/size pair handling.
When processing KF_ARG_PTR_TO_MEM_SIZE arguments, the verifier now uses
is_kfunc_arg_nullable() instead of the removed is_kfunc_arg_optional()
to determine whether to skip size validation for NULL buffers.
This is the first documentation added for the __nullable annotation,
which has been in use since it was introduced but was previously
undocumented.
No functional changes to verifier behavior - nullable buffer/size pairs
continue to work exactly as before.
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Puranjay Mohan <puranjay@kernel.org>
Link: https://lore.kernel.org/r/20260102221513.1961781-1-puranjay@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
When arena allocations were converted from bpf_map_alloc_pages() to
kmalloc_nolock() to support non-sleepable contexts, memcg accounting was
inadvertently lost. This commit restores proper memory accounting for
all arena-related allocations.
All arena related allocations are accounted into memcg of the process
that created bpf_arena.
Signed-off-by: Puranjay Mohan <puranjay@kernel.org>
Link: https://lore.kernel.org/r/20260102200230.25168-3-puranjay@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Introduce bpf_map_memcg_enter() and bpf_map_memcg_exit() helpers to
reduce code duplication in memcg context management.
bpf_map_memcg_enter() gets the memcg from the map, sets it as active,
and returns both the previous and the now active memcg.
bpf_map_memcg_exit() restores the previous active memcg and releases the
reference obtained during enter.
Signed-off-by: Puranjay Mohan <puranjay@kernel.org>
Link: https://lore.kernel.org/r/20260102200230.25168-2-puranjay@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management fix from Rafael Wysocki:
"Fix a recent regression that affects system suspend testing
at the 'core' level (Rafael Wysocki)"
* tag 'pm-6.19-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
PM: sleep: Fix suspend_test() at the TEST_CORE level
|
|
Now that KF_TRUSTED_ARGS is the default for all kfuncs, remove the
explicit KF_TRUSTED_ARGS flag from all kfunc definitions and remove the
flag itself.
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com>
Signed-off-by: Puranjay Mohan <puranjay@kernel.org>
Link: https://lore.kernel.org/r/20260102180038.2708325-3-puranjay@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Change the verifier to make trusted args the default requirement for
all kfuncs by removing is_kfunc_trusted_args() assuming it be to always
return true.
This works because:
1. Context pointers (xdp_md, __sk_buff, etc.) are handled through their
own KF_ARG_PTR_TO_CTX case label and bypass the trusted check
2. Struct_ops callback arguments are already marked as PTR_TRUSTED during
initialization and pass is_trusted_reg()
3. KF_RCU kfuncs are handled separately via is_kfunc_rcu() checks at
call sites (always checked with || alongside is_kfunc_trusted_args)
This simple change makes all kfuncs require trusted args by default
while maintaining correct behavior for all existing special cases.
Note: This change means kfuncs that previously accepted NULL pointers
without KF_TRUSTED_ARGS will now reject NULL at verification time.
Several netfilter kfuncs are affected: bpf_xdp_ct_lookup(),
bpf_skb_ct_lookup(), bpf_xdp_ct_alloc(), and bpf_skb_ct_alloc() all
accept NULL for their bpf_tuple and opts parameters internally (checked
in __bpf_nf_ct_lookup), but after this change the verifier rejects NULL
before the kfunc is even called. This is acceptable because these kfuncs
don't work with NULL parameters in their proper usage. Now they will be
rejected rather than returning an error, which shouldn't make a
difference to BPF programs that were using these kfuncs properly.
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com>
Signed-off-by: Puranjay Mohan <puranjay@kernel.org>
Link: https://lore.kernel.org/r/20260102180038.2708325-2-puranjay@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
If a driver is buggy and has 2 overlapping mappings but only
sets cache clean flag on the 1st one of them, we warn.
But if it only does it for the 2nd one, we don't.
Fix by tracking cache clean flag in the entry.
Message-ID: <0ffb3513d18614539c108b4548cdfbc64274a7d1.1767601130.git.mst@redhat.com>
Reviewed-by: Petr Tesarik <ptesarik@suse.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
|
|
This commit adds irq, NMI, and softirq context checks to the
rcu_torture_timer() function. Just because you are paranoid does not
mean that they are not out to get you... ;-)
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
|
|
This commit adds a ->exp_current member to the tasks_tracing_ops structure
to test the rcu_tasks_trace_expedite_current() function.
[ paulmck: Apply kernel test robot feedback. ]
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: bpf@vger.kernel.org
Reviewed-by: Joel Fernandes <joelagnelf@nvidia.com>
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
|