<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-stable.git/drivers/vhost/net.c, branch linux-6.16.y</title>
<subtitle>Linux kernel stable tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/'/>
<entry>
<title>vhost-net: flush batched before enabling notifications</title>
<updated>2025-10-02T11:48:37+00:00</updated>
<author>
<name>Jason Wang</name>
<email>jasowang@redhat.com</email>
</author>
<published>2025-09-17T06:30:45+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=a63e7dcf6a5524569dfb1c666e8fc1b1bb79daa7'/>
<id>a63e7dcf6a5524569dfb1c666e8fc1b1bb79daa7</id>
<content type='text'>
commit e430451613c7a27beeadd00d707bcf7ceec6328e upstream.

Commit 8c2e6b26ffe2 ("vhost/net: Defer TX queue re-enable until after
sendmsg") tries to defer the notification enabling by moving the logic
out of the loop after the vhost_tx_batch() when nothing new is spotted.
This caused unexpected side effects as the new logic is reused for
several other error conditions.

A previous patch reverted 8c2e6b26ffe2. Now, bring the performance
back up by flushing batched buffers before enabling notifications.

Reported-by: Jon Kohler &lt;jon@nutanix.com&gt;
Cc: stable@vger.kernel.org
Fixes: 8c2e6b26ffe2 ("vhost/net: Defer TX queue re-enable until after sendmsg")
Signed-off-by: Jason Wang &lt;jasowang@redhat.com&gt;
Signed-off-by: Michael S. Tsirkin &lt;mst@redhat.com&gt;
Message-Id: &lt;20250917063045.2042-3-jasowang@redhat.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit e430451613c7a27beeadd00d707bcf7ceec6328e upstream.

Commit 8c2e6b26ffe2 ("vhost/net: Defer TX queue re-enable until after
sendmsg") tries to defer the notification enabling by moving the logic
out of the loop after the vhost_tx_batch() when nothing new is spotted.
This caused unexpected side effects as the new logic is reused for
several other error conditions.

A previous patch reverted 8c2e6b26ffe2. Now, bring the performance
back up by flushing batched buffers before enabling notifications.

Reported-by: Jon Kohler &lt;jon@nutanix.com&gt;
Cc: stable@vger.kernel.org
Fixes: 8c2e6b26ffe2 ("vhost/net: Defer TX queue re-enable until after sendmsg")
Signed-off-by: Jason Wang &lt;jasowang@redhat.com&gt;
Signed-off-by: Michael S. Tsirkin &lt;mst@redhat.com&gt;
Message-Id: &lt;20250917063045.2042-3-jasowang@redhat.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Revert "vhost/net: Defer TX queue re-enable until after sendmsg"</title>
<updated>2025-10-02T11:48:37+00:00</updated>
<author>
<name>Michael S. Tsirkin</name>
<email>mst@redhat.com</email>
</author>
<published>2025-09-17T06:30:44+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=7de587f87f37e59e42bf7aec587cf5d82c64806f'/>
<id>7de587f87f37e59e42bf7aec587cf5d82c64806f</id>
<content type='text'>
commit 4174152771bf0d014d58f7d7e148bb0c8830fe53 upstream.

This reverts commit 8c2e6b26ffe243be1e78f5a4bfb1a857d6e6f6d6. It tries
to defer the notification enabling by moving the logic out of the loop
after the vhost_tx_batch() when nothing new is spotted. This will
bring side effects as the new logic would be reused for several other
error conditions.

One example is the IOTLB: when there's an IOTLB miss, get_tx_bufs()
might return -EAGAIN and exit the loop, then see that there are still
available buffers, so it will queue the tx work again until userspace
feeds the IOTLB entry correctly. This will slow down the tx processing
and trigger the TX watchdog in the guest as reported in
https://lkml.org/lkml/2025/9/10/1596.

To fix, revert the change. A follow up patch will bring the performance
back in a safe way.

Reported-by: Jon Kohler &lt;jon@nutanix.com&gt;
Cc: stable@vger.kernel.org
Fixes: 8c2e6b26ffe2 ("vhost/net: Defer TX queue re-enable until after sendmsg")
Signed-off-by: Jason Wang &lt;jasowang@redhat.com&gt;
Signed-off-by: Michael S. Tsirkin &lt;mst@redhat.com&gt;
Message-Id: &lt;20250917063045.2042-2-jasowang@redhat.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 4174152771bf0d014d58f7d7e148bb0c8830fe53 upstream.

This reverts commit 8c2e6b26ffe243be1e78f5a4bfb1a857d6e6f6d6. It tries
to defer the notification enabling by moving the logic out of the loop
after the vhost_tx_batch() when nothing new is spotted. This will
bring side effects as the new logic would be reused for several other
error conditions.

One example is the IOTLB: when there's an IOTLB miss, get_tx_bufs()
might return -EAGAIN and exit the loop, then see that there are still
available buffers, so it will queue the tx work again until userspace
feeds the IOTLB entry correctly. This will slow down the tx processing
and trigger the TX watchdog in the guest as reported in
https://lkml.org/lkml/2025/9/10/1596.

To fix, revert the change. A follow up patch will bring the performance
back in a safe way.

Reported-by: Jon Kohler &lt;jon@nutanix.com&gt;
Cc: stable@vger.kernel.org
Fixes: 8c2e6b26ffe2 ("vhost/net: Defer TX queue re-enable until after sendmsg")
Signed-off-by: Jason Wang &lt;jasowang@redhat.com&gt;
Signed-off-by: Michael S. Tsirkin &lt;mst@redhat.com&gt;
Message-Id: &lt;20250917063045.2042-2-jasowang@redhat.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>vhost/net: Protect ubufs with rcu read lock in vhost_net_ubuf_put()</title>
<updated>2025-09-04T14:55:32+00:00</updated>
<author>
<name>Nikolay Kuratov</name>
<email>kniv@yandex-team.ru</email>
</author>
<published>2025-08-05T13:09:17+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=6b4abb5b7928049e8209f549b1da8a9619ce5511'/>
<id>6b4abb5b7928049e8209f549b1da8a9619ce5511</id>
<content type='text'>
commit dd54bcf86c91a4455b1f95cbc8e9ac91205f3193 upstream.

When operating on struct vhost_net_ubuf_ref, the following execution
sequence is theoretically possible:
CPU0 is finalizing DMA operation                   CPU1 is doing VHOST_NET_SET_BACKEND
                             // ubufs-&gt;refcount == 2
vhost_net_ubuf_put()                               vhost_net_ubuf_put_wait_and_free(oldubufs)
                                                     vhost_net_ubuf_put_and_wait()
                                                       vhost_net_ubuf_put()
                                                         int r = atomic_sub_return(1, &amp;ubufs-&gt;refcount);
                                                         // r = 1
int r = atomic_sub_return(1, &amp;ubufs-&gt;refcount);
// r = 0
                                                      wait_event(ubufs-&gt;wait, !atomic_read(&amp;ubufs-&gt;refcount));
                                                      // no wait occurs here because condition is already true
                                                    kfree(ubufs);
if (unlikely(!r))
  wake_up(&amp;ubufs-&gt;wait);  // use-after-free

This leads to use-after-free on ubufs access. This happens because CPU1
skips waiting for wake_up() when refcount is already zero.

To prevent that, use a read-side RCU critical section in vhost_net_ubuf_put(),
as suggested by Hillf Danton. For this lock to take effect, free ubufs with
kfree_rcu().

Cc: stable@vger.kernel.org
Fixes: 0ad8b480d6ee9 ("vhost: fix ref cnt checking deadlock")
Reported-by: Andrey Ryabinin &lt;arbn@yandex-team.com&gt;
Suggested-by: Hillf Danton &lt;hdanton@sina.com&gt;
Signed-off-by: Nikolay Kuratov &lt;kniv@yandex-team.ru&gt;
Message-Id: &lt;20250805130917.727332-1-kniv@yandex-team.ru&gt;
Signed-off-by: Michael S. Tsirkin &lt;mst@redhat.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit dd54bcf86c91a4455b1f95cbc8e9ac91205f3193 upstream.

When operating on struct vhost_net_ubuf_ref, the following execution
sequence is theoretically possible:
CPU0 is finalizing DMA operation                   CPU1 is doing VHOST_NET_SET_BACKEND
                             // ubufs-&gt;refcount == 2
vhost_net_ubuf_put()                               vhost_net_ubuf_put_wait_and_free(oldubufs)
                                                     vhost_net_ubuf_put_and_wait()
                                                       vhost_net_ubuf_put()
                                                         int r = atomic_sub_return(1, &amp;ubufs-&gt;refcount);
                                                         // r = 1
int r = atomic_sub_return(1, &amp;ubufs-&gt;refcount);
// r = 0
                                                      wait_event(ubufs-&gt;wait, !atomic_read(&amp;ubufs-&gt;refcount));
                                                      // no wait occurs here because condition is already true
                                                    kfree(ubufs);
if (unlikely(!r))
  wake_up(&amp;ubufs-&gt;wait);  // use-after-free

This leads to use-after-free on ubufs access. This happens because CPU1
skips waiting for wake_up() when refcount is already zero.

To prevent that, use a read-side RCU critical section in vhost_net_ubuf_put(),
as suggested by Hillf Danton. For this lock to take effect, free ubufs with
kfree_rcu().

Cc: stable@vger.kernel.org
Fixes: 0ad8b480d6ee9 ("vhost: fix ref cnt checking deadlock")
Reported-by: Andrey Ryabinin &lt;arbn@yandex-team.com&gt;
Suggested-by: Hillf Danton &lt;hdanton@sina.com&gt;
Signed-off-by: Nikolay Kuratov &lt;kniv@yandex-team.ru&gt;
Message-Id: &lt;20250805130917.727332-1-kniv@yandex-team.ru&gt;
Signed-off-by: Michael S. Tsirkin &lt;mst@redhat.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>vhost/net: Defer TX queue re-enable until after sendmsg</title>
<updated>2025-05-06T01:18:41+00:00</updated>
<author>
<name>Jon Kohler</name>
<email>jon@nutanix.com</email>
</author>
<published>2025-05-01T02:04:28+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=8c2e6b26ffe243be1e78f5a4bfb1a857d6e6f6d6'/>
<id>8c2e6b26ffe243be1e78f5a4bfb1a857d6e6f6d6</id>
<content type='text'>
In handle_tx_copy, TX batching processes packets below ~PAGE_SIZE and
batches up to 64 messages before calling sock-&gt;sendmsg.

Currently, when there are no more messages on the ring to dequeue,
handle_tx_copy re-enables kicks on the ring *before* firing off the
batch sendmsg. However, sock-&gt;sendmsg incurs a non-zero delay,
especially if it needs to wake up a thread (e.g., another vhost worker).

If the guest submits additional messages immediately after the last ring
check and disablement, it triggers an EPT_MISCONFIG vmexit to attempt to
kick the vhost worker. This may happen while the worker is still
processing the sendmsg, leading to wasteful exit(s).

This is particularly problematic for single-threaded guest submission
threads, as they must exit, wait for the exit to be processed
(potentially involving a TTWU), and then resume.

In scenarios like a constant stream of UDP messages, this results in a
sawtooth pattern where the submitter frequently vmexits, and the
vhost-net worker alternates between sleeping and waking.

A common solution is to configure vhost-net busy polling via userspace
(e.g., qemu poll-us). However, treating the sendmsg as the "busy"
period by keeping kicks disabled during the final sendmsg and
performing one additional ring check afterward provides a significant
performance improvement without any excess busy poll cycles.

If messages are found in the ring after the final sendmsg, requeue the
TX handler. This ensures fairness for the RX handler and allows
vhost_run_work_list to cond_resched() as needed.

Test Case
    TX VM: taskset -c 2 iperf3  -c rx-ip-here -t 60 -p 5200 -b 0 -u -i 5
    RX VM: taskset -c 2 iperf3 -s -p 5200 -D
    6.12.0, each worker backed by tun interface with IFF_NAPI setup.
    Note: TCP side is largely unchanged as that was copy bound

6.12.0 unpatched
    EPT_MISCONFIG/second: 5411
    Datagrams/second: ~382k
    Interval         Transfer     Bitrate         Lost/Total Datagrams
    0.00-30.00  sec  15.5 GBytes  4.43 Gbits/sec  0/11481630 (0%)  sender

6.12.0 patched
    EPT_MISCONFIG/second: 58 (~93x reduction)
    Datagrams/second: ~650k  (~1.7x increase)
    Interval         Transfer     Bitrate         Lost/Total Datagrams
    0.00-30.00  sec  26.4 GBytes  7.55 Gbits/sec  0/19554720 (0%)  sender

Acked-by: Jason Wang &lt;jasowang@redhat.com&gt;
Signed-off-by: Jon Kohler &lt;jon@nutanix.com&gt;
Acked-by: Michael S. Tsirkin &lt;mst@redhat.com&gt;
Link: https://patch.msgid.link/20250501020428.1889162-1-jon@nutanix.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
In handle_tx_copy, TX batching processes packets below ~PAGE_SIZE and
batches up to 64 messages before calling sock-&gt;sendmsg.

Currently, when there are no more messages on the ring to dequeue,
handle_tx_copy re-enables kicks on the ring *before* firing off the
batch sendmsg. However, sock-&gt;sendmsg incurs a non-zero delay,
especially if it needs to wake up a thread (e.g., another vhost worker).

If the guest submits additional messages immediately after the last ring
check and disablement, it triggers an EPT_MISCONFIG vmexit to attempt to
kick the vhost worker. This may happen while the worker is still
processing the sendmsg, leading to wasteful exit(s).

This is particularly problematic for single-threaded guest submission
threads, as they must exit, wait for the exit to be processed
(potentially involving a TTWU), and then resume.

In scenarios like a constant stream of UDP messages, this results in a
sawtooth pattern where the submitter frequently vmexits, and the
vhost-net worker alternates between sleeping and waking.

A common solution is to configure vhost-net busy polling via userspace
(e.g., qemu poll-us). However, treating the sendmsg as the "busy"
period by keeping kicks disabled during the final sendmsg and
performing one additional ring check afterward provides a significant
performance improvement without any excess busy poll cycles.

If messages are found in the ring after the final sendmsg, requeue the
TX handler. This ensures fairness for the RX handler and allows
vhost_run_work_list to cond_resched() as needed.

Test Case
    TX VM: taskset -c 2 iperf3  -c rx-ip-here -t 60 -p 5200 -b 0 -u -i 5
    RX VM: taskset -c 2 iperf3 -s -p 5200 -D
    6.12.0, each worker backed by tun interface with IFF_NAPI setup.
    Note: TCP side is largely unchanged as that was copy bound

6.12.0 unpatched
    EPT_MISCONFIG/second: 5411
    Datagrams/second: ~382k
    Interval         Transfer     Bitrate         Lost/Total Datagrams
    0.00-30.00  sec  15.5 GBytes  4.43 Gbits/sec  0/11481630 (0%)  sender

6.12.0 patched
    EPT_MISCONFIG/second: 58 (~93x reduction)
    Datagrams/second: ~650k  (~1.7x increase)
    Interval         Transfer     Bitrate         Lost/Total Datagrams
    0.00-30.00  sec  26.4 GBytes  7.55 Gbits/sec  0/19554720 (0%)  sender

Acked-by: Jason Wang &lt;jasowang@redhat.com&gt;
Signed-off-by: Jon Kohler &lt;jon@nutanix.com&gt;
Acked-by: Michael S. Tsirkin &lt;mst@redhat.com&gt;
Link: https://patch.msgid.link/20250501020428.1889162-1-jon@nutanix.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>vhost/net: Set num_buffers for virtio 1.0</title>
<updated>2025-01-27T14:39:25+00:00</updated>
<author>
<name>Akihiko Odaki</name>
<email>akihiko.odaki@daynix.com</email>
</author>
<published>2024-09-15T01:35:53+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=a3b9c053d82a9e524746f5473ad1bd18e9894bfa'/>
<id>a3b9c053d82a9e524746f5473ad1bd18e9894bfa</id>
<content type='text'>
The specification says the device MUST set num_buffers to 1 if
VIRTIO_NET_F_MRG_RXBUF has not been negotiated.

Fixes: 41e3e42108bc ("vhost/net: enable virtio 1.0")
Signed-off-by: Akihiko Odaki &lt;akihiko.odaki@daynix.com&gt;
Message-Id: &lt;20240915-v1-v1-1-f10d2cb5e759@daynix.com&gt;
Signed-off-by: Michael S. Tsirkin &lt;mst@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The specification says the device MUST set num_buffers to 1 if
VIRTIO_NET_F_MRG_RXBUF has not been negotiated.

Fixes: 41e3e42108bc ("vhost/net: enable virtio 1.0")
Signed-off-by: Akihiko Odaki &lt;akihiko.odaki@daynix.com&gt;
Message-Id: &lt;20240915-v1-v1-1-f10d2cb5e759@daynix.com&gt;
Signed-off-by: Michael S. Tsirkin &lt;mst@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>mm: page_frag: avoid caller accessing 'page_frag_cache' directly</title>
<updated>2024-11-11T18:56:27+00:00</updated>
<author>
<name>Yunsheng Lin</name>
<email>linyunsheng@huawei.com</email>
</author>
<published>2024-10-28T11:53:39+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=3d18dfe69ce46f106af327736d2261d7e3ee81c0'/>
<id>3d18dfe69ce46f106af327736d2261d7e3ee81c0</id>
<content type='text'>
Use appropriate frag_page API instead of caller accessing
'page_frag_cache' directly.

CC: Andrew Morton &lt;akpm@linux-foundation.org&gt;
CC: Linux-MM &lt;linux-mm@kvack.org&gt;
Signed-off-by: Yunsheng Lin &lt;linyunsheng@huawei.com&gt;
Reviewed-by: Alexander Duyck &lt;alexanderduyck@fb.com&gt;
Acked-by: Chuck Lever &lt;chuck.lever@oracle.com&gt;
Link: https://patch.msgid.link/20241028115343.3405838-5-linyunsheng@huawei.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Use appropriate frag_page API instead of caller accessing
'page_frag_cache' directly.

CC: Andrew Morton &lt;akpm@linux-foundation.org&gt;
CC: Linux-MM &lt;linux-mm@kvack.org&gt;
Signed-off-by: Yunsheng Lin &lt;linyunsheng@huawei.com&gt;
Reviewed-by: Alexander Duyck &lt;alexanderduyck@fb.com&gt;
Acked-by: Chuck Lever &lt;chuck.lever@oracle.com&gt;
Link: https://patch.msgid.link/20241028115343.3405838-5-linyunsheng@huawei.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: extend ubuf_info callback to ops structure</title>
<updated>2024-04-22T23:21:35+00:00</updated>
<author>
<name>Pavel Begunkov</name>
<email>asml.silence@gmail.com</email>
</author>
<published>2024-04-19T11:08:39+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=7ab4f16f9e2440e797eae88812f800458e5879d2'/>
<id>7ab4f16f9e2440e797eae88812f800458e5879d2</id>
<content type='text'>
We'll need to associate additional callbacks with ubuf_info, so introduce
a structure holding ubuf_info callbacks. Apart from the smarter
io_uring notification management introduced in the next patches, it can
be used to generalise msg_zerocopy_put_abort() and also store
-&gt;sg_from_iter, which is currently passed in struct msghdr.

Reviewed-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Reviewed-by: David Ahern &lt;dsahern@kernel.org&gt;
Signed-off-by: Pavel Begunkov &lt;asml.silence@gmail.com&gt;
Reviewed-by: Willem de Bruijn &lt;willemb@google.com&gt;
Link: https://lore.kernel.org/all/a62015541de49c0e2a8a0377a1d5d0a5aeb07016.1713369317.git.asml.silence@gmail.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
We'll need to associate additional callbacks with ubuf_info, so introduce
a structure holding ubuf_info callbacks. Apart from the smarter
io_uring notification management introduced in the next patches, it can
be used to generalise msg_zerocopy_put_abort() and also store
-&gt;sg_from_iter, which is currently passed in struct msghdr.

Reviewed-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Reviewed-by: David Ahern &lt;dsahern@kernel.org&gt;
Signed-off-by: Pavel Begunkov &lt;asml.silence@gmail.com&gt;
Reviewed-by: Willem de Bruijn &lt;willemb@google.com&gt;
Link: https://lore.kernel.org/all/a62015541de49c0e2a8a0377a1d5d0a5aeb07016.1713369317.git.asml.silence@gmail.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost</title>
<updated>2024-03-19T15:57:39+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2024-03-19T15:57:39+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=d95fcdf4961d27a3d17e5c7728367197adc89b8d'/>
<id>d95fcdf4961d27a3d17e5c7728367197adc89b8d</id>
<content type='text'>
Pull virtio updates from Michael Tsirkin:

 - Per vq sizes in vdpa

 - Info query for block devices support in vdpa

 - DMA sync callbacks in vduse

 - Fixes, cleanups

* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: (35 commits)
  virtio_net: rename free_old_xmit_skbs to free_old_xmit
  virtio_net: unify the code for recycling the xmit ptr
  virtio-net: add cond_resched() to the command waiting loop
  virtio-net: convert rx mode setting to use workqueue
  virtio: packed: fix unmap leak for indirect desc table
  vDPA: report virtio-blk flush info to user space
  vDPA: report virtio-block read-only info to user space
  vDPA: report virtio-block write zeroes configuration to user space
  vDPA: report virtio-block discarding configuration to user space
  vDPA: report virtio-block topology info to user space
  vDPA: report virtio-block MQ info to user space
  vDPA: report virtio-block max segments in a request to user space
  vDPA: report virtio-block block-size to user space
  vDPA: report virtio-block max segment size to user space
  vDPA: report virtio-block capacity to user space
  virtio: make virtio_bus const
  vdpa: make vdpa_bus const
  vDPA/ifcvf: implement vdpa_config_ops.get_vq_num_min
  vDPA/ifcvf: get_max_vq_size to return max size
  virtio_vdpa: create vqs with the actual size
  ...
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull virtio updates from Michael Tsirkin:

 - Per vq sizes in vdpa

 - Info query for block devices support in vdpa

 - DMA sync callbacks in vduse

 - Fixes, cleanups

* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: (35 commits)
  virtio_net: rename free_old_xmit_skbs to free_old_xmit
  virtio_net: unify the code for recycling the xmit ptr
  virtio-net: add cond_resched() to the command waiting loop
  virtio-net: convert rx mode setting to use workqueue
  virtio: packed: fix unmap leak for indirect desc table
  vDPA: report virtio-blk flush info to user space
  vDPA: report virtio-block read-only info to user space
  vDPA: report virtio-block write zeroes configuration to user space
  vDPA: report virtio-block discarding configuration to user space
  vDPA: report virtio-block topology info to user space
  vDPA: report virtio-block MQ info to user space
  vDPA: report virtio-block max segments in a request to user space
  vDPA: report virtio-block block-size to user space
  vDPA: report virtio-block max segment size to user space
  vDPA: report virtio-block capacity to user space
  virtio: make virtio_bus const
  vdpa: make vdpa_bus const
  vDPA/ifcvf: implement vdpa_config_ops.get_vq_num_min
  vDPA/ifcvf: get_max_vq_size to return max size
  virtio_vdpa: create vqs with the actual size
  ...
</pre>
</div>
</content>
</entry>
<entry>
<title>vhost: Added pad cleanup if vnet_hdr is not present.</title>
<updated>2024-03-19T06:45:49+00:00</updated>
<author>
<name>Andrew Melnychenko</name>
<email>andrew@daynix.com</email>
</author>
<published>2024-01-15T19:48:40+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=f6baca2d32ead51b10e15f1eef812d7c6a7b9a40'/>
<id>f6baca2d32ead51b10e15f1eef812d7c6a7b9a40</id>
<content type='text'>
When QEMU is launched with vhost but without tap vnet_hdr,
vhost tries to copy vnet_hdr from the socket iter with size 0
to a page that may contain some trash.
That trash can be interpreted as unpredictable values for
vnet_hdr.
That leads to dropping some packets and, in some cases, to
stalling the vhost routine when vhost_net tries to process
packets and fails in a loop.

Qemu options:
  -netdev tap,vhost=on,vnet_hdr=off,...

Signed-off-by: Andrew Melnychenko &lt;andrew@daynix.com&gt;
Message-Id: &lt;20240115194840.1183077-1-andrew@daynix.com&gt;
Signed-off-by: Michael S. Tsirkin &lt;mst@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When QEMU is launched with vhost but without tap vnet_hdr,
vhost tries to copy vnet_hdr from the socket iter with size 0
to a page that may contain some trash.
That trash can be interpreted as unpredictable values for
vnet_hdr.
That leads to dropping some packets and, in some cases, to
stalling the vhost routine when vhost_net tries to process
packets and fails in a loop.

Qemu options:
  -netdev tap,vhost=on,vnet_hdr=off,...

Signed-off-by: Andrew Melnychenko &lt;andrew@daynix.com&gt;
Message-Id: &lt;20240115194840.1183077-1-andrew@daynix.com&gt;
Signed-off-by: Michael S. Tsirkin &lt;mst@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>vhost/net: remove vhost_net_page_frag_refill()</title>
<updated>2024-03-05T10:38:14+00:00</updated>
<author>
<name>Yunsheng Lin</name>
<email>linyunsheng@huawei.com</email>
</author>
<published>2024-02-28T09:30:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=4051bd8129ac6c1faf447264aa7a8f91feb778b8'/>
<id>4051bd8129ac6c1faf447264aa7a8f91feb778b8</id>
<content type='text'>
The page frag in vhost_net_page_frag_refill() uses the
'struct page_frag' from skb_page_frag_refill(), but its
implementation is similar to page_frag_alloc_align() now.

This patch removes vhost_net_page_frag_refill() by using
'struct page_frag_cache' instead of 'struct page_frag',
and allocating frag using page_frag_alloc_align().

The added benefit is not only unifying the page frag
implementation a little, but also a performance boost of about
0.5% when testing with the vhost_net_test introduced in the
last patch.

Signed-off-by: Yunsheng Lin &lt;linyunsheng@huawei.com&gt;
Acked-by: Jason Wang &lt;jasowang@redhat.com&gt;
Acked-by: Michael S. Tsirkin &lt;mst@redhat.com&gt;
Signed-off-by: Paolo Abeni &lt;pabeni@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The page frag in vhost_net_page_frag_refill() uses the
'struct page_frag' from skb_page_frag_refill(), but its
implementation is similar to page_frag_alloc_align() now.

This patch removes vhost_net_page_frag_refill() by using
'struct page_frag_cache' instead of 'struct page_frag',
and allocating frag using page_frag_alloc_align().

The added benefit is not only unifying the page frag
implementation a little, but also a performance boost of about
0.5% when testing with the vhost_net_test introduced in the
last patch.

Signed-off-by: Yunsheng Lin &lt;linyunsheng@huawei.com&gt;
Acked-by: Jason Wang &lt;jasowang@redhat.com&gt;
Acked-by: Michael S. Tsirkin &lt;mst@redhat.com&gt;
Signed-off-by: Paolo Abeni &lt;pabeni@redhat.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
