| author | Kuniyuki Iwashima <kuniyu@google.com> | 2026-05-02 03:13:08 +0000 |
|---|---|---|
| committer | Jakub Kicinski <kuba@kernel.org> | 2026-05-05 17:47:06 -0700 |
| commit | 72d3b9a4c2b137b32fdf5342699d16229e2ac75e (patch) | |
| tree | 37f893d0c7c177e90760cc0a0f51cfb9804bd58a /include/linux/platform_data | |
| parent | 1ae552c7b6658c23fba8e964e687785297078880 (diff) | |
udp_tunnel: Remove synchronize_rcu() in udp_tunnel_sock_release().
Commit 3cf7203ca620 ("net/tunnel: wait until all sk_user_data
reader finish before releasing the sock") added synchronize_rcu()
in udp_tunnel_sock_release().
This was intended to protect the fast path of a dying vxlan device
from dereferencing vxlan_sock->sock->sk after sock_orphan() has set
sock->sk to NULL.
However, vxlan does not need to access struct socket itself
in the fast path; it only reads struct sock, and struct socket
is only used for tunnel setup and teardown.
This applies to all other UDP tunnel users, and they have been
converted to access struct sock directly.
In addition, each device-specific struct used in their fast paths
is freed after one RCU grace period. Since this occurs after
udp_tunnel_sock_release(), the struct is guaranteed to be freed
after struct udp_sock.
Therefore, synchronize_rcu() in udp_tunnel_sock_release() is
now redundant.
Let's remove it.
Tested:
A script that creates and brings up vxlan devices in 4000 netns runs
roughly 10x faster with this change (about 50.6s down to 5.2s of real
time). The same improvement is seen with other UDP tunnel devices.
$ cat vxlan.sh
for i in $(seq 1 40)
do
        (for j in $(seq 1 100) ; do
                unshare -n bash -c "ip link add vxlan0 type vxlan id 100 local 127.0.0.1 dstport 4789 && ip link set vxlan0 up";
        done) &
done
wait
With bpftrace, we can see vxlan_stop() is significantly faster.
bpftrace -e '
kprobe:vxlan_stop {
        @start[tid] = nsecs;
}
kretprobe:vxlan_stop /@start[tid]/ {
        @duration_us = hist((nsecs - @start[tid]) / 1000);
        delete(@start[tid]);
}
END {
        printf("\nExecution time of vxlan_stop (us):\n");
}'
Before:
# time ./vxlan.sh // without bpftrace
real 0m50.615s
user 0m8.171s
sys 1m45.101s
@duration_us:
[4K, 8K)       1266 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@                   |
[8K, 16K)      1957 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
[16K, 32K)      764 |@@@@@@@@@@@@@@@@@@@@                                |
[32K, 64K)        6 |                                                    |
[64K, 128K)       4 |                                                    |
[128K, 256K)      3 |                                                    |
After:
# time ./vxlan.sh // without bpftrace
real 0m5.247s
user 0m7.956s
sys 1m47.404s
@duration_us:
[16, 32)       3411 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
[32, 64)        383 |@@@@@                                               |
[64, 128)       107 |@                                                   |
[128, 256)       79 |@                                                   |
[256, 512)       16 |                                                    |
[512, 1K)         2 |                                                    |
[1K, 2K)          2 |                                                    |
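As a quick sanity check on the "10x" figure, the ratio of the two "real"
values reported by time in the Before and After runs can be computed
directly. This is only an illustrative sketch: the speedup helper below is
a hypothetical name and is not part of the patch or the test script.

```shell
#!/bin/sh
# Hypothetical helper (not part of the patch): ratio of two wall-clock times.
speedup() {
    # $1 = elapsed seconds before the change, $2 = elapsed seconds after
    awk -v before="$1" -v after="$2" 'BEGIN { printf "%.1f\n", before / after }'
}

# "real" values copied from the two time runs, expressed in seconds.
speedup 50.615 5.247
```

With the numbers above this prints 9.6, consistent with the rounded 10x
claim. The vxlan_stop() histograms show a much larger per-call difference,
since most calls move from the 4K-32K us buckets to the 16-32 us bucket.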
The next step is to remove the remaining synchronize_net() in vxlan_stop()
and its variants in other devices.
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Link: https://patch.msgid.link/20260502031401.3557229-16-kuniyu@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
