path: root/include/linux/timerqueue.h
author: Mark Bloch <mbloch@nvidia.com> 2026-04-28 08:10:17 +0300
committer: Jakub Kicinski <kuba@kernel.org> 2026-04-29 17:46:28 -0700
commit: 6a92fe1956d285dd8d454e2b7ef49d0bae81bcbc
tree: 35212980bdff515bb8b54dc178f482d19dd14885 /include/linux/timerqueue.h
parent: 2a110ee54e8911aa6f66baec52252ce4431afe91
net/mlx5: E-Switch, fix deadlock between devlink lock and esw->wq

mlx5_eswitch_cleanup() calls destroy_workqueue() while holding the devlink lock through mlx5_uninit_one(). E-Switch workqueue workers also need the devlink lock, but previously took it before checking whether their work item was stale. This can deadlock when cleanup waits for a worker that is blocked on the same devlink lock.

Mode changes have the same ordering hazard: the mode-change path holds devlink lock while tearing down the current mode, and old work may still be pending on the E-Switch workqueue.

Fix this by making esw_wq_handler() check the generation counter before attempting to take devlink lock. The worker uses devl_trylock(); if the lock is busy and the work is still current, it sleeps on an E-Switch wait queue with a short timeout. Invalidation increments the generation counter and wakes the wait queue, so stale workers exit without spinning or blocking cleanup.

Invalidate work at the earliest safe operation boundary. Cleanup invalidates before destroy_workqueue(), and QoS cleanup runs after the workqueue is destroyed. Mode teardown unregisters the work-producing notifiers first, then invalidates the queue before tearing down FDB/QoS/rate-node state. This prevents new notifier work from capturing the new generation while still making old work stale before expensive teardown starts.

mlx5_devlink_eswitch_mode_set() now relies on mlx5_eswitch_disable_locked() for the mode-change invalidation instead of incrementing the generation after disable. mlx5_eswitch_disable() gets the same coverage. SR-IOV enable/disable paths invalidate before VF state changes so work against the old VF count or mode is discarded.

Remove the conditional generation increment in mlx5_eswitch_event_handler_unregister(); mlx5_eswitch_disable_locked() now handles it unconditionally after the relevant notifiers are unregistered.
Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20260428051018.219093-4-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Diffstat (limited to 'include/linux/timerqueue.h')
0 files changed, 0 insertions, 0 deletions