linux.git - Linux kernel source tree

diff options

author	Tejun Heo <tj@kernel.org>	2026-04-10 07:54:06 -1000
committer	Tejun Heo <tj@kernel.org>	2026-04-10 07:54:06 -1000
commit	d1d3c1c6ae3691869be9d94730d6e5325aaae8c6 (patch)
tree	cb57e9b4b9dedb5fdb92bde8751c24e5ca7b9366 /include/net/phy/git@git.tavy.me:linux.git
parent	2193af26a149acfb7a66f49397665640c2a60d8c (diff)

sched_ext: Add verifier-time kfunc context filter

Move enforcement of SCX context-sensitive kfunc restrictions from per-task runtime kf_mask checks to BPF verifier-time filtering, using the BPF core's struct_ops context information. A shared .filter callback is attached to each context-sensitive BTF set and consults a per-op allow table (scx_kf_allow_flags[]) indexed by SCX ops member offset. Disallowed calls are now rejected at program load time instead of at runtime. The old model split reachability across two places: each SCX_CALL_OP*() set bits naming its op context, and each kfunc's scx_kf_allowed() check OR'd together the bits it accepted. A kfunc was callable when those two masks overlapped. The new model transposes the result to the caller side - each op's allow flags directly list the kfunc groups it may call. The old bit assignments were: Call-site bits: ops.select_cpu = ENQUEUE | SELECT_CPU ops.enqueue = ENQUEUE ops.dispatch = DISPATCH ops.cpu_release = CPU_RELEASE Kfunc-group accepted bits: enqueue group = ENQUEUE | DISPATCH select_cpu group = SELECT_CPU | ENQUEUE dispatch group = DISPATCH cpu_release group = CPU_RELEASE Intersecting them yields the reachability now expressed directly by scx_kf_allow_flags[]: ops.select_cpu -> SELECT_CPU | ENQUEUE ops.enqueue -> SELECT_CPU | ENQUEUE ops.dispatch -> ENQUEUE | DISPATCH ops.cpu_release -> CPU_RELEASE Unlocked ops carried no kf_mask bits and reached only unlocked kfuncs; that maps directly to UNLOCKED in the new table. Equivalence was checked by walking every (op, kfunc-group) combination across SCX ops, SYSCALL, and non-SCX struct_ops callers against the old scx_kf_allowed() runtime checks. With two intended exceptions (see below), all combinations reach the same verdict; disallowed calls are now caught at load time instead of firing scx_error() at runtime. scx_bpf_dsq_move_set_slice() and scx_bpf_dsq_move_set_vtime() are exceptions: they have no runtime check at all, but the new filter rejects them from ops outside dispatch/unlocked. The affected cases are nonsensical - the values these setters store are only read by scx_bpf_dsq_move{,_vtime}(), which is itself restricted to dispatch/unlocked, so a setter call from anywhere else was already dead code. Runtime scx_kf_mask enforcement is left in place by this patch and removed in a follow-up. Original-patch-by: Juntong Deng <juntong.deng@outlook.com> Original-patch-by: Cheng-Yang Chou <yphbchou0911@gmail.com> Signed-off-by: Tejun Heo <tj@kernel.org> Reviewed-by: Andrea Righi <arighi@nvidia.com>

Diffstat (limited to 'include/net/phy/git@git.tavy.me:linux.git')

0 files changed, 0 insertions, 0 deletions


context:
space:
mode: