| author | Simon Schippers <simon.schippers@tu-dortmund.de> | 2026-05-10 17:15:29 +0200 |
|---|---|---|
| committer | Jakub Kicinski <kuba@kernel.org> | 2026-05-13 17:52:55 -0700 |
| commit | 1d6e569b7d0c0b2736636749e4be0a27f3cefcb3 (patch) | |
| tree | a4a35048f0e03eec9b69ff8d08b1a0881a03c188 /scripts/const_structs.checkpatch | |
| parent | fba362c17d9d9211fc51f272156bb84fc23bdf98 (diff) | |
tun/tap & vhost-net: avoid ptr_ring tail-drop when a qdisc is present
This commit prevents tail-drop when a qdisc is present and the ptr_ring
becomes full. Once the ring reaches capacity after a produce attempt,
the netdev queue is stopped instead of dropping subsequent packets.
If no qdisc is present, the previous tail-drop behavior is preserved.
If producing an entry fails anyway due to a race, tun_net_xmit() drops
the packet. Such races are expected because LLTX is enabled and the
transmit path operates without the usual locking.
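The producer-side policy described above can be sketched as a single-threaded toy model. This is not the driver code: struct toy_ring, toy_xmit(), toy_consume() and RING_SIZE are hypothetical names, the queue_stopped flag merely stands in for __QUEUE_STATE_DRV_XOFF, and waking on every consume is an assumption made to keep the sketch short.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define RING_SIZE 4 /* hypothetical capacity, chosen for the sketch */

/* Toy stand-in for a ptr_ring plus the netdev queue state. */
struct toy_ring {
	void *slot[RING_SIZE]; /* NULL means the slot is free */
	size_t head;           /* next slot the producer writes */
	size_t tail;           /* next slot the consumer reads */
	bool queue_stopped;    /* stands in for __QUEUE_STATE_DRV_XOFF */
	unsigned long dropped; /* packets tail-dropped by the producer */
};

enum xmit_result { XMIT_OK, XMIT_DROPPED };

/* As in ptr_ring, the ring is full when the producer slot is occupied. */
static bool toy_ring_full(const struct toy_ring *r)
{
	return r->slot[r->head] != NULL;
}

static enum xmit_result toy_xmit(struct toy_ring *r, void *pkt, bool has_qdisc)
{
	assert(pkt != NULL);

	/* Produce failed: drop. Without a qdisc this is the old tail-drop. */
	if (toy_ring_full(r)) {
		r->dropped++;
		return XMIT_DROPPED;
	}

	r->slot[r->head] = pkt;
	r->head = (r->head + 1) % RING_SIZE;

	/* Ring became full after this produce: stop instead of dropping later. */
	if (has_qdisc && toy_ring_full(r))
		r->queue_stopped = true;

	return XMIT_OK;
}

static void *toy_consume(struct toy_ring *r)
{
	void *pkt = r->slot[r->tail];

	if (pkt == NULL)
		return NULL;

	r->slot[r->tail] = NULL; /* the consumer writes zero to the ring */
	r->tail = (r->tail + 1) % RING_SIZE;
	r->queue_stopped = false; /* assumption: wake on every consume */
	return pkt;
}
```

With has_qdisc set, filling the ring stops the queue and nothing is dropped. With it clear, the queue is never stopped and the first produce attempt after the ring is full is dropped.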
The __tun_wake_queue() function of the consumer races with the producer
for waking/stopping the netdev queue, which could result in a stalled
queue. Therefore, an smp_mb__after_atomic() is introduced that pairs
with the smp_mb() of the consumer. It follows the principle of store
buffering described in tools/memory-model/Documentation/recipes.txt:
- The producer in tun_net_xmit() first sets __QUEUE_STATE_DRV_XOFF,
followed by an smp_mb__after_atomic() (= smp_mb()), and then reads the
ring with __ptr_ring_check_produce().
- The consumer in __tun_wake_queue() first writes zero to the ring in
__ptr_ring_consume(), followed by an smp_mb(), and then reads the queue
status with netif_tx_queue_stopped().
=> Following the aforementioned principle, it is impossible for the
producer to see a full ring (and therefore not wake the queue on the
re-check) while the consumer simultaneously fails to see a stopped
queue (and therefore also does not wake it).
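The store-buffering argument above can be checked with a small brute-force model. The sketch enumerates every schedule of the four accesses and counts those in which the producer sees a full ring while the consumer sees a running queue. It assumes that the full barriers can be modeled as "each side's store is ordered before its own load" and that missing barriers can be modeled as arbitrary reordering. That is a simplification of the kernel memory model, and all names below are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* The four accesses of the store-buffering pattern. */
enum op {
	P_STORE_XOFF, /* producer: set the stopped bit */
	P_LOAD_RING,  /* producer: re-check whether the ring is full */
	C_STORE_RING, /* consumer: write zero to the ring slot */
	C_LOAD_XOFF,  /* consumer: read the stopped bit */
	NUM_OPS
};

/* Index at which operation o runs in schedule s. */
static int position(const int s[NUM_OPS], int o)
{
	for (int i = 0; i < NUM_OPS; i++)
		if (s[i] == o)
			return i;
	return -1;
}

static bool is_permutation(const int s[NUM_OPS])
{
	bool seen[NUM_OPS] = { false };

	for (int i = 0; i < NUM_OPS; i++) {
		if (seen[s[i]])
			return false;
		seen[s[i]] = true;
	}
	return true;
}

/* Run one schedule; true if neither side would wake the queue. */
static bool schedule_stalls(const int s[NUM_OPS])
{
	bool xoff = false, ring_full = true;
	bool producer_saw_full = false, consumer_saw_xoff = false;

	for (int i = 0; i < NUM_OPS; i++) {
		switch (s[i]) {
		case P_STORE_XOFF: xoff = true; break;
		case P_LOAD_RING:  producer_saw_full = ring_full; break;
		case C_STORE_RING: ring_full = false; break;
		case C_LOAD_XOFF:  consumer_saw_xoff = xoff; break;
		}
	}
	return producer_saw_full && !consumer_saw_xoff;
}

/* Count stalling schedules, with or without store-before-load ordering. */
static int count_stalls(bool barriers)
{
	int stalls = 0;
	int s[NUM_OPS];

	for (s[0] = 0; s[0] < NUM_OPS; s[0]++)
	for (s[1] = 0; s[1] < NUM_OPS; s[1]++)
	for (s[2] = 0; s[2] < NUM_OPS; s[2]++)
	for (s[3] = 0; s[3] < NUM_OPS; s[3]++) {
		if (!is_permutation(s))
			continue;
		if (barriers &&
		    (position(s, P_STORE_XOFF) > position(s, P_LOAD_RING) ||
		     position(s, C_STORE_RING) > position(s, C_LOAD_XOFF)))
			continue;
		if (schedule_stalls(s))
			stalls++;
	}
	return stalls;
}
```

In this model count_stalls(true) is zero, while count_stalls(false) is nonzero. The stalled outcome needs the producer's load to run before the consumer's store and the consumer's load to run before the producer's store, which contradicts store-before-load ordering on both sides.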
Benchmarks:
The benchmarks show a slight regression in raw transmission performance
when using two sending threads. With the patch and an fq_codel qdisc,
packet loss occurs only in the two-thread sending case; no packet loss
was observed with a single sending thread.
Test setup:
AMD Ryzen 5 5600X at 4.3 GHz, 3200 MHz RAM, isolated QEMU threads;
Averages over 50 runs of 100,000,000 packets each. SRSO and Spectre v2
mitigations disabled.
Note for tap+vhost-net:
XDP drop program active in VM -> ~2.5x faster; slower for tap due to
more syscalls (high utilization of entry_SYSRETQ_unsafe_stack in perf)
+--------------------------+--------------+----------------+----------+
| 1 thread | Stock | Patched with | diff |
| sending | | fq_codel qdisc | |
+------------+-------------+--------------+----------------+----------+
| TAP | Received | 1.132 Mpps | 1.123 Mpps | -0.8% |
| +-------------+--------------+----------------+----------+
| | Lost/s | 3.765 Mpps | 0 pps | |
+------------+-------------+--------------+----------------+----------+
| TAP | Received | 3.857 Mpps | 3.901 Mpps | +1.1% |
| +-------------+--------------+----------------+----------+
| +vhost-net | Lost/s | 0.802 Mpps | 0 pps | |
+------------+-------------+--------------+----------------+----------+
+--------------------------+--------------+----------------+----------+
| 2 threads | Stock | Patched with | diff |
| sending | | fq_codel qdisc | |
+------------+-------------+--------------+----------------+----------+
| TAP | Received | 1.115 Mpps | 1.081 Mpps | -3.0% |
| +-------------+--------------+----------------+----------+
| | Lost/s | 8.490 Mpps | 391 pps | |
+------------+-------------+--------------+----------------+----------+
| TAP | Received | 3.664 Mpps | 3.555 Mpps | -3.0% |
| +-------------+--------------+----------------+----------+
| +vhost-net | Lost/s | 5.330 Mpps | 938 pps | |
+------------+-------------+--------------+----------------+----------+
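The diff column is consistent with the relative change of the patched receive rate against stock. A minimal sketch of that arithmetic, assuming this is how the column was computed (rel_diff_percent() is a hypothetical helper, not part of the patch):

```c
#include <assert.h>

/* Relative change of patched versus stock, in percent. */
static double rel_diff_percent(double stock, double patched)
{
	assert(stock != 0.0);
	return (patched - stock) / stock * 100.0;
}
```

For example, rel_diff_percent(1.132, 1.123) is roughly -0.8, which matches the single-thread TAP row.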
Co-developed-by: Tim Gebauer <tim.gebauer@tu-dortmund.de>
Signed-off-by: Tim Gebauer <tim.gebauer@tu-dortmund.de>
Signed-off-by: Simon Schippers <simon.schippers@tu-dortmund.de>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Link: https://patch.msgid.link/20260510151529.43895-5-simon.schippers@tu-dortmund.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
