| author | David Carlier <devnexen@gmail.com> | 2026-03-20 07:26:45 +0000 |
|---|---|---|
| committer | Martin KaFai Lau <martin.lau@kernel.org> | 2026-03-24 15:17:20 -0700 |
| commit | 8ed82f807bb09d2c8455aaa665f2c6cb17bc6a19 (patch) | |
| tree | 27f7c58062916545bbba11173f7f6658d35e7d4f | |
| parent | 7f5b0a60a8b925db0cde62d87f1031dc04acbeb2 (diff) | |
bpf: Use RCU-safe iteration in dev_map_redirect_multi() SKB path

The DEVMAP_HASH branch in dev_map_redirect_multi() uses
hlist_for_each_entry_safe() to iterate hash buckets, but this function
runs under RCU protection (called from xdp_do_generic_redirect_map()
in softirq context). Concurrent writers (__dev_map_hash_update_elem,
dev_map_hash_delete_elem) modify the list using RCU primitives
(hlist_add_head_rcu, hlist_del_rcu).

hlist_for_each_entry_safe() performs plain pointer dereferences without
rcu_dereference(), missing the acquire barrier needed to pair with
writers' rcu_assign_pointer(). On weakly-ordered architectures (ARM64,
POWER), a reader can observe a partially-constructed node. It also
defeats CONFIG_PROVE_RCU lockdep validation and KCSAN data-race
detection.
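
The publish/subscribe pairing described above can be sketched in userspace with C11 atomics. This is a rough model under stated assumptions, not kernel code: `struct node`, `struct list_head_model`, `publish_head()` and `sum_ifindex()` are hypothetical names, the release store stands in for rcu_assign_pointer(), and the acquire loads stand in for rcu_dereference() (acquire is stronger than the dependency ordering the kernel primitive relies on). Grace periods and node reclamation are not modelled.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical list node; stands in for an entry on a hash bucket list. */
struct node {
	int ifindex;                    /* payload written before publication */
	_Atomic(struct node *) next;    /* link read by concurrent readers */
};

struct list_head_model {
	_Atomic(struct node *) first;
};

/*
 * Writer side, modelling hlist_add_head_rcu(): initialise the node fully,
 * then publish it with a release store so that a reader which sees the
 * pointer also sees the initialised fields.
 */
static void publish_head(struct list_head_model *h, struct node *n, int ifindex)
{
	assert(h && n);
	n->ifindex = ifindex;
	atomic_store_explicit(&n->next,
			      atomic_load_explicit(&h->first, memory_order_relaxed),
			      memory_order_relaxed);
	atomic_store_explicit(&h->first, n, memory_order_release);
}

/*
 * Reader side, modelling hlist_for_each_entry_rcu(): every link is fetched
 * with an ordered load before the node behind it is dereferenced.
 */
static int sum_ifindex(struct list_head_model *h)
{
	int sum = 0;
	struct node *pos;

	for (pos = atomic_load_explicit(&h->first, memory_order_acquire);
	     pos != NULL;
	     pos = atomic_load_explicit(&pos->next, memory_order_acquire))
		sum += pos->ifindex;
	return sum;
}
```

A plain `pos = h->first` walk would compile to similar code on many targets, but it gives the compiler and race-detection tooling no indication that the load is racing with a writer, which is the gap the message above describes.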
Replace with hlist_for_each_entry_rcu() using rcu_read_lock_bh_held()
as the lockdep condition, consistent with the rcu_dereference_check()
used in the DEVMAP (non-hash) branch of the same functions. Also fix
the same incorrect lockdep_is_held(&dtab->index_lock) condition in
dev_map_enqueue_multi(), where the lock is not held either.
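
As a loose illustration of why a "_safe" walk is not a substitute for a reader-side walk, the sketch below models what a "_safe" iterator provides: it caches the successor so that the loop body itself may unlink the current entry. It gives no ordering against another CPU modifying the list. All names here (`struct item`, `for_each_item_safe`, `remove_matching()`, `count_items()`) are hypothetical and the list is a simplified singly linked one, not the kernel's hlist.

```c
#include <stddef.h>

/* Hypothetical singly linked node; not the kernel's struct hlist_node. */
struct item {
	int ifindex;
	struct item *next;
};

/*
 * Model of a "_safe" walk: the successor is cached in n before the body
 * runs, so the body may unlink the current item. This guards against
 * removal by the iterating code itself, not against concurrent writers.
 */
#define for_each_item_safe(pos, n, head) \
	for ((pos) = (head); (pos) && ((n) = (pos)->next, 1); (pos) = (n))

/* Unlink every item whose ifindex matches; returns the new list head. */
static struct item *remove_matching(struct item *head, int ifindex)
{
	struct item *pos, *n, *new_head = NULL, **tail = &new_head;

	for_each_item_safe(pos, n, head) {
		if (pos->ifindex == ifindex) {
			pos->next = NULL;	/* current item leaves the list */
			continue;
		}
		*tail = pos;		/* keep the item, append to result */
		tail = &pos->next;
	}
	*tail = NULL;
	return new_head;
}

/* Count the items reachable from head. */
static int count_items(const struct item *head)
{
	int count = 0;

	for (; head; head = head->next)
		count++;
	return count;
}
```

The cached `n` is what lets `remove_matching()` clear `pos->next` mid-walk. A read-only path such as the broadcast loop in this patch never unlinks anything, so the cached successor buys nothing there, which is consistent with the patch dropping the `next` variable.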
Fixes: e624d4ed4aa8 ("xdp: Extend xdp_redirect_map with broadcast support")
Signed-off-by: David Carlier <devnexen@gmail.com>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://patch.msgid.link/20260320072645.16731-1-devnexen@gmail.com
| -rw-r--r-- | kernel/bpf/devmap.c | 5 |
1 files changed, 2 insertions, 3 deletions
diff --git a/kernel/bpf/devmap.c b/kernel/bpf/devmap.c
index 3d619d01088e..cc0a43ebab6b 100644
--- a/kernel/bpf/devmap.c
+++ b/kernel/bpf/devmap.c
@@ -665,7 +665,7 @@ int dev_map_enqueue_multi(struct xdp_frame *xdpf, struct net_device *dev_rx,
 		for (i = 0; i < dtab->n_buckets; i++) {
 			head = dev_map_index_hash(dtab, i);
 			hlist_for_each_entry_rcu(dst, head, index_hlist,
-						 lockdep_is_held(&dtab->index_lock)) {
+						 rcu_read_lock_bh_held()) {
 				if (!is_valid_dst(dst, xdpf))
 					continue;
 
@@ -747,7 +747,6 @@ int dev_map_redirect_multi(struct net_device *dev, struct sk_buff *skb,
 	struct bpf_dtab_netdev *dst, *last_dst = NULL;
 	int excluded_devices[1+MAX_NEST_DEV];
 	struct hlist_head *head;
-	struct hlist_node *next;
 	int num_excluded = 0;
 	unsigned int i;
 	int err;
@@ -787,7 +786,7 @@ int dev_map_redirect_multi(struct net_device *dev, struct sk_buff *skb,
 	} else { /* BPF_MAP_TYPE_DEVMAP_HASH */
 		for (i = 0; i < dtab->n_buckets; i++) {
 			head = dev_map_index_hash(dtab, i);
-			hlist_for_each_entry_safe(dst, next, head, index_hlist) {
+			hlist_for_each_entry_rcu(dst, head, index_hlist, rcu_read_lock_bh_held()) {
 				if (is_ifindex_excluded(excluded_devices, num_excluded,
 							dst->dev->ifindex))
 					continue;
