| author | Tejun Heo <tj@kernel.org> | 2026-03-13 09:43:22 -1000 |
|---|---|---|
| committer | Tejun Heo <tj@kernel.org> | 2026-03-13 09:43:22 -1000 |
| commit | 98d709cba3193f0bec54da4cd76ef499ea2f1ef7 | |
| tree | c402f9053b2c3bc28f7132a28b95b88ef12f2972 | /include/linux/sched |
| parent | b5b38761b45a6c7d91760d212fda8b46df8c5362 | |
sched_ext: Implement SCX_ENQ_IMMED
Add SCX_ENQ_IMMED enqueue flag for local DSQ insertions. Once a task is
dispatched with IMMED, it either gets on the CPU immediately and stays on it,
or gets reenqueued back to the BPF scheduler. It will never linger on a local
DSQ behind other tasks or on a CPU taken by a higher-priority class.
rq_is_open() uses rq->next_class to determine whether the rq is available,
and wakeup_preempt_scx() triggers reenqueue when a higher-priority class task
arrives. These capture all higher class preemptions. Combined with reenqueue
points in the dispatch path, all cases where an IMMED task would not execute
immediately are covered.
SCX_TASK_IMMED persists in p->scx.flags until the next fresh enqueue, so the
guarantee survives SAVE/RESTORE cycles. If preempted while running,
put_prev_task_scx() reenqueues through ops.enqueue() with
SCX_TASK_REENQ_PREEMPTED instead of silently placing the task back on the
local DSQ.
This enables tighter scheduling latency control by preventing tasks from
piling up on local DSQs. It also enables opportunistic CPU sharing across
sub-schedulers - without this, a sub-scheduler can stuff the local DSQ of a
shared CPU, making it difficult for others to use.
v2: - Rewrite is_curr_done() as rq_is_open() using rq->next_class and
implement wakeup_preempt_scx() to achieve complete coverage of all
cases where IMMED tasks could get stranded.
- Track IMMED persistently in p->scx.flags and reenqueue
preempted-while-running tasks through ops.enqueue().
- Bound deferred reenq cycles (SCX_REENQ_LOCAL_MAX_REPEAT).
- Misc renames, documentation.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Andrea Righi <arighi@nvidia.com>
Diffstat (limited to 'include/linux/sched')
| -rw-r--r-- | include/linux/sched/ext.h | 5 |
1 file changed, 5 insertions(+), 0 deletions(-)
```diff
diff --git a/include/linux/sched/ext.h b/include/linux/sched/ext.h
index 60a4f65d0174..602dc83cab36 100644
--- a/include/linux/sched/ext.h
+++ b/include/linux/sched/ext.h
@@ -100,6 +100,7 @@ enum scx_ent_flags {
 	SCX_TASK_RESET_RUNNABLE_AT	= 1 << 2, /* runnable_at should be reset */
 	SCX_TASK_DEQD_FOR_SLEEP		= 1 << 3, /* last dequeue was for SLEEP */
 	SCX_TASK_SUB_INIT		= 1 << 4, /* task being initialized for a sub sched */
+	SCX_TASK_IMMED			= 1 << 5, /* task is on local DSQ with %SCX_ENQ_IMMED */
 
 	/*
 	 * Bits 8 and 9 are used to carry task state:
@@ -125,6 +126,8 @@ enum scx_ent_flags {
 	 *
 	 * NONE		not being reenqueued
 	 * KFUNC	reenqueued by scx_bpf_dsq_reenq() and friends
+	 * IMMED	reenqueued due to failed ENQ_IMMED
+	 * PREEMPTED	preempted while running
 	 */
 	SCX_TASK_REENQ_REASON_SHIFT	= 12,
 	SCX_TASK_REENQ_REASON_BITS	= 2,
@@ -132,6 +135,8 @@ enum scx_ent_flags {
 	SCX_TASK_REENQ_NONE		= 0 << SCX_TASK_REENQ_REASON_SHIFT,
 	SCX_TASK_REENQ_KFUNC		= 1 << SCX_TASK_REENQ_REASON_SHIFT,
+	SCX_TASK_REENQ_IMMED		= 2 << SCX_TASK_REENQ_REASON_SHIFT,
+	SCX_TASK_REENQ_PREEMPTED	= 3 << SCX_TASK_REENQ_REASON_SHIFT,
 
 	/* iteration cursor, not a task */
 	SCX_TASK_CURSOR			= 1 << 31,
```
