author    Tejun Heo <tj@kernel.org>    2026-03-22 10:33:08 -1000
committer Tejun Heo <tj@kernel.org>    2026-03-22 14:05:25 -1000
commit    76edc2761ab8bd27fe4c4b8b2fb71baefc4a31e8 (patch)
tree      d980b826248deb574455d8304c60e69b8987e8d6 /kernel
parent    f03ffe53ab6ffc798ed8291090cebf19c6e5fa3b (diff)
sched_ext: Use irq_work_queue_on() in schedule_deferred()
schedule_deferred() uses irq_work_queue() which always queues on the calling CPU. The deferred work can run from any CPU correctly, and the _locked() path already processes remote rqs from the calling CPU. However, when falling through to the irq_work path, queuing on the target CPU is preferable as the work can run sooner via IPI delivery rather than waiting for the calling CPU to re-enable IRQs.

Currently, only reenqueue operations use this path - either BPF-initiated reenqueue targeting a remote rq, or IMMED reenqueue when the target CPU is busy running userspace (not in balance or wakeup, so the _locked() fast paths aren't available).

Use irq_work_queue_on() to target the owning CPU. This improves IMMED reenqueue latency when tasks are dispatched to remote local DSQs.

Testing on a 24-CPU AMD Ryzen 3900X with scx_qmap -I -F 50 (ALWAYS_ENQ_IMMED, every 50th enqueue forced to prev_cpu's local DSQ) under heavy mixed load (2x CPU oversubscription, yield and context-switch pressure, SCHED_FIFO bursts, periodic fork storms, mixed nice levels, C-states disabled), measuring local DSQ residence time (insert to remove) over 5 x 120s runs (~1.2M tasks per set):

  >128us outliers: 71 -> 39 (-45%)
  >256us outliers: 59 -> 36 (-39%)

Signed-off-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com>
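For readers unfamiliar with the irq_work API, below is a minimal sketch of the two queueing primitives the patch switches between, using the generic <linux/irq_work.h> interface; demo_fn, demo_work, and demo_defer_to_cpu are illustrative names, not part of the patch or of sched_ext.

#include <linux/irq_work.h>
#include <linux/smp.h>
#include <linux/printk.h>

/* Callback; runs in hard IRQ context once the work is delivered. */
static void demo_fn(struct irq_work *work)
{
	pr_info("irq_work ran on CPU%d\n", smp_processor_id());
}

static struct irq_work demo_work = IRQ_WORK_INIT(demo_fn);

static void demo_defer_to_cpu(int cpu)
{
	/*
	 * irq_work_queue() would always queue on the calling CPU, where
	 * the work only runs once IRQs are (re-)enabled there.
	 *
	 * irq_work_queue_on() queues on @cpu instead, kicking a remote
	 * target with an IPI so the work can start without waiting on
	 * the caller's IRQ state; for the local CPU it behaves like
	 * irq_work_queue(). Both return false if @work is still pending.
	 */
	if (!irq_work_queue_on(&demo_work, cpu))
		pr_info("demo_work already pending\n");
}

In the patch, the target passed to irq_work_queue_on() is cpu_of(rq), i.e. the CPU that owns the rq whose deferred work was scheduled.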
Diffstat (limited to 'kernel')
-rw-r--r--    kernel/sched/ext.c    14
1 file changed, 11 insertions(+), 3 deletions(-)
diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c
index 2e7a1259bd7c..72a07eb050a3 100644
--- a/kernel/sched/ext.c
+++ b/kernel/sched/ext.c
@@ -1164,10 +1164,18 @@ static void deferred_irq_workfn(struct irq_work *irq_work)
static void schedule_deferred(struct rq *rq)
{
/*
- * Queue an irq work. They are executed on IRQ re-enable which may take
- * a bit longer than the scheduler hook in schedule_deferred_locked().
+ * This is the fallback when schedule_deferred_locked() can't use
+ * the cheaper balance callback or wakeup hook paths (the target
+ * CPU is not in balance or wakeup). Currently, this is primarily
+ * hit by reenqueue operations targeting a remote CPU.
+ *
+ * Queue on the target CPU. The deferred work can run from any CPU
+ * correctly - the _locked() path already processes remote rqs from
+ * the calling CPU - but targeting the owning CPU allows IPI delivery
+ * without waiting for the calling CPU to re-enable IRQs and is
+ * cheaper as the reenqueue runs locally.
*/
- irq_work_queue(&rq->scx.deferred_irq_work);
+ irq_work_queue_on(&rq->scx.deferred_irq_work, cpu_of(rq));
}
/**