<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-stable.git/io_uring/net.c, branch v6.2</title>
<subtitle>Linux kernel stable tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/'/>
<entry>
<title>io_uring/net: cache provided buffer group value for multishot receives</title>
<updated>2023-01-23T14:08:08+00:00</updated>
<author>
<name>Jens Axboe</name>
<email>axboe@kernel.dk</email>
</author>
<published>2023-01-22T17:02:55+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=b00c51ef8f72ced0965d021a291b98ff822c5337'/>
<id>b00c51ef8f72ced0965d021a291b98ff822c5337</id>
<content type='text'>
If we're using ring provided buffers with multishot receive, and we end
up doing an io-wq based issue at some points that also needs to select
a buffer, we'll lose the initially assigned buffer group as
io_ring_buffer_select() correctly clears the buffer group list as the
issue isn't serialized by the ctx uring_lock. This is fine for normal
receives as the request puts the buffer and finishes, but for multishot,
we will re-arm and do further receives. On the next trigger for this
multishot receive, the receive will try and pick from a buffer group
whose value is the same as the buffer ID of the las receive. That is
obviously incorrect, and will result in a premature -ENOUFS error for
the receive even if we had available buffers in the correct group.

Cache the buffer group value at prep time, so we can restore it for
future receives. This only needs doing for the above mentioned case, but
just do it by default to keep it easier to read.

Cc: stable@vger.kernel.org
Fixes: b3fdea6ecb55 ("io_uring: multishot recv")
Fixes: 9bb66906f23e ("io_uring: support multishot in recvmsg")
Cc: Dylan Yudaken &lt;dylany@meta.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
If we're using ring provided buffers with multishot receive, and we end
up doing an io-wq based issue at some points that also needs to select
a buffer, we'll lose the initially assigned buffer group as
io_ring_buffer_select() correctly clears the buffer group list as the
issue isn't serialized by the ctx uring_lock. This is fine for normal
receives as the request puts the buffer and finishes, but for multishot,
we will re-arm and do further receives. On the next trigger for this
multishot receive, the receive will try and pick from a buffer group
whose value is the same as the buffer ID of the las receive. That is
obviously incorrect, and will result in a premature -ENOUFS error for
the receive even if we had available buffers in the correct group.

Cache the buffer group value at prep time, so we can restore it for
future receives. This only needs doing for the above mentioned case, but
just do it by default to keep it easier to read.

Cc: stable@vger.kernel.org
Fixes: b3fdea6ecb55 ("io_uring: multishot recv")
Fixes: 9bb66906f23e ("io_uring: support multishot in recvmsg")
Cc: Dylan Yudaken &lt;dylany@meta.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>io_uring/net: fix cleanup after recycle</title>
<updated>2022-12-19T15:28:28+00:00</updated>
<author>
<name>Pavel Begunkov</name>
<email>asml.silence@gmail.com</email>
</author>
<published>2022-12-19T15:11:40+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=6c3e8955d4bd9811a6e1761eea412a14fb51a2e6'/>
<id>6c3e8955d4bd9811a6e1761eea412a14fb51a2e6</id>
<content type='text'>
Don't access io_async_msghdr io_netmsg_recycle(), it may be reallocated.

Cc: stable@vger.kernel.org
Fixes: 9bb66906f23e5 ("io_uring: support multishot in recvmsg")
Signed-off-by: Pavel Begunkov &lt;asml.silence@gmail.com&gt;
Link: https://lore.kernel.org/r/9e326f4ad4046ddadf15bf34bf3fa58c6372f6b5.1671461985.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Don't access io_async_msghdr io_netmsg_recycle(), it may be reallocated.

Cc: stable@vger.kernel.org
Fixes: 9bb66906f23e5 ("io_uring: support multishot in recvmsg")
Signed-off-by: Pavel Begunkov &lt;asml.silence@gmail.com&gt;
Link: https://lore.kernel.org/r/9e326f4ad4046ddadf15bf34bf3fa58c6372f6b5.1671461985.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>io_uring/net: ensure compat import handlers clear free_iov</title>
<updated>2022-12-19T14:35:16+00:00</updated>
<author>
<name>Jens Axboe</name>
<email>axboe@kernel.dk</email>
</author>
<published>2022-12-19T14:28:26+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=990a4de57e44f4f4cfc33c90d2ec5d285b7c8342'/>
<id>990a4de57e44f4f4cfc33c90d2ec5d285b7c8342</id>
<content type='text'>
If we're not allocating the vectors because the count is below
UIO_FASTIOV, we still do need to properly clear -&gt;free_iov to prevent
an erronous free of on-stack data.

Reported-by: Jiri Slaby &lt;jirislaby@gmail.com&gt;
Fixes: 4c17a496a7a0 ("io_uring/net: fix cleanup double free free_iov init")
Cc: stable@vger.kernel.org
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
If we're not allocating the vectors because the count is below
UIO_FASTIOV, we still do need to properly clear -&gt;free_iov to prevent
an erronous free of on-stack data.

Reported-by: Jiri Slaby &lt;jirislaby@gmail.com&gt;
Fixes: 4c17a496a7a0 ("io_uring/net: fix cleanup double free free_iov init")
Cc: stable@vger.kernel.org
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge tag 'for-6.2/io_uring-next-2022-12-08' of git://git.kernel.dk/linux</title>
<updated>2022-12-13T18:40:31+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2022-12-13T18:40:31+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=96f7e448b9f4546ffd0356ffceb2b9586777f316'/>
<id>96f7e448b9f4546ffd0356ffceb2b9586777f316</id>
<content type='text'>
Pull io_uring updates part two from Jens Axboe:

 - Misc fixes (me, Lin)

 - Series from Pavel extending the single task exclusive ring mode,
   yielding nice improvements for the common case of having a single
   ring per thread (Pavel)

 - Cleanup for MSG_RING, removing our IOPOLL hack (Pavel)

 - Further poll cleanups and fixes (Pavel)

 - Misc cleanups and fixes (Pavel)

* tag 'for-6.2/io_uring-next-2022-12-08' of git://git.kernel.dk/linux: (22 commits)
  io_uring/msg_ring: flag target ring as having task_work, if needed
  io_uring: skip spinlocking for -&gt;task_complete
  io_uring: do msg_ring in target task via tw
  io_uring: extract a io_msg_install_complete helper
  io_uring: get rid of double locking
  io_uring: never run tw and fallback in parallel
  io_uring: use tw for putting rsrc
  io_uring: force multishot CQEs into task context
  io_uring: complete all requests in task context
  io_uring: don't check overflow flush failures
  io_uring: skip overflow CQE posting for dying ring
  io_uring: improve io_double_lock_ctx fail handling
  io_uring: dont remove file from msg_ring reqs
  io_uring: reshuffle issue_flags
  io_uring: don't reinstall quiesce node for each tw
  io_uring: improve rsrc quiesce refs checks
  io_uring: don't raw spin unlock to match cq_lock
  io_uring: combine poll tw handlers
  io_uring: improve poll warning handling
  io_uring: remove ctx variable in io_poll_check_events
  ...
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull io_uring updates part two from Jens Axboe:

 - Misc fixes (me, Lin)

 - Series from Pavel extending the single task exclusive ring mode,
   yielding nice improvements for the common case of having a single
   ring per thread (Pavel)

 - Cleanup for MSG_RING, removing our IOPOLL hack (Pavel)

 - Further poll cleanups and fixes (Pavel)

 - Misc cleanups and fixes (Pavel)

* tag 'for-6.2/io_uring-next-2022-12-08' of git://git.kernel.dk/linux: (22 commits)
  io_uring/msg_ring: flag target ring as having task_work, if needed
  io_uring: skip spinlocking for -&gt;task_complete
  io_uring: do msg_ring in target task via tw
  io_uring: extract a io_msg_install_complete helper
  io_uring: get rid of double locking
  io_uring: never run tw and fallback in parallel
  io_uring: use tw for putting rsrc
  io_uring: force multishot CQEs into task context
  io_uring: complete all requests in task context
  io_uring: don't check overflow flush failures
  io_uring: skip overflow CQE posting for dying ring
  io_uring: improve io_double_lock_ctx fail handling
  io_uring: dont remove file from msg_ring reqs
  io_uring: reshuffle issue_flags
  io_uring: don't reinstall quiesce node for each tw
  io_uring: improve rsrc quiesce refs checks
  io_uring: don't raw spin unlock to match cq_lock
  io_uring: combine poll tw handlers
  io_uring: improve poll warning handling
  io_uring: remove ctx variable in io_poll_check_events
  ...
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge tag 'for-6.2/io_uring-2022-12-08' of git://git.kernel.dk/linux</title>
<updated>2022-12-13T18:33:08+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2022-12-13T18:33:08+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=54e60e505d6144a22c787b5be1fdce996a27be1b'/>
<id>54e60e505d6144a22c787b5be1fdce996a27be1b</id>
<content type='text'>
Pull io_uring updates from Jens Axboe:

 - Always ensure proper ordering in case of CQ ring overflow, which then
   means we can remove some work-arounds for that (Dylan)

 - Support completion batching for multishot, greatly increasing the
   efficiency for those (Dylan)

 - Flag epoll/eventfd wakeups done from io_uring, so that we can easily
   tell if we're recursing into io_uring again.

   Previously, this would have resulted in repeated multishot
   notifications if we had a dependency there. That could happen if an
   eventfd was registered as the ring eventfd, and we multishot polled
   for events on it. Or if an io_uring fd was added to epoll, and
   io_uring had a multishot request for the epoll fd.

   Test cases here:
	https://git.kernel.dk/cgit/liburing/commit/?id=919755a7d0096fda08fb6d65ac54ad8d0fe027cd

   Previously these got terminated when the CQ ring eventually
   overflowed, now it's handled gracefully (me).

 - Tightening of the IOPOLL based completions (Pavel)

 - Optimizations of the networking zero-copy paths (Pavel)

 - Various tweaks and fixes (Dylan, Pavel)

* tag 'for-6.2/io_uring-2022-12-08' of git://git.kernel.dk/linux: (41 commits)
  io_uring: keep unlock_post inlined in hot path
  io_uring: don't use complete_post in kbuf
  io_uring: spelling fix
  io_uring: remove io_req_complete_post_tw
  io_uring: allow multishot polled reqs to defer completion
  io_uring: remove overflow param from io_post_aux_cqe
  io_uring: add lockdep assertion in io_fill_cqe_aux
  io_uring: make io_fill_cqe_aux static
  io_uring: add io_aux_cqe which allows deferred completion
  io_uring: allow defer completion for aux posted cqes
  io_uring: defer all io_req_complete_failed
  io_uring: always lock in io_apoll_task_func
  io_uring: remove iopoll spinlock
  io_uring: iopoll protect complete_post
  io_uring: inline __io_req_complete_put()
  io_uring: remove io_req_tw_post_queue
  io_uring: use io_req_task_complete() in timeout
  io_uring: hold locks for io_req_complete_failed
  io_uring: add completion locking for iopoll
  io_uring: kill io_cqring_ev_posted() and __io_cq_unlock_post()
  ...
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull io_uring updates from Jens Axboe:

 - Always ensure proper ordering in case of CQ ring overflow, which then
   means we can remove some work-arounds for that (Dylan)

 - Support completion batching for multishot, greatly increasing the
   efficiency for those (Dylan)

 - Flag epoll/eventfd wakeups done from io_uring, so that we can easily
   tell if we're recursing into io_uring again.

   Previously, this would have resulted in repeated multishot
   notifications if we had a dependency there. That could happen if an
   eventfd was registered as the ring eventfd, and we multishot polled
   for events on it. Or if an io_uring fd was added to epoll, and
   io_uring had a multishot request for the epoll fd.

   Test cases here:
	https://git.kernel.dk/cgit/liburing/commit/?id=919755a7d0096fda08fb6d65ac54ad8d0fe027cd

   Previously these got terminated when the CQ ring eventually
   overflowed, now it's handled gracefully (me).

 - Tightening of the IOPOLL based completions (Pavel)

 - Optimizations of the networking zero-copy paths (Pavel)

 - Various tweaks and fixes (Dylan, Pavel)

* tag 'for-6.2/io_uring-2022-12-08' of git://git.kernel.dk/linux: (41 commits)
  io_uring: keep unlock_post inlined in hot path
  io_uring: don't use complete_post in kbuf
  io_uring: spelling fix
  io_uring: remove io_req_complete_post_tw
  io_uring: allow multishot polled reqs to defer completion
  io_uring: remove overflow param from io_post_aux_cqe
  io_uring: add lockdep assertion in io_fill_cqe_aux
  io_uring: make io_fill_cqe_aux static
  io_uring: add io_aux_cqe which allows deferred completion
  io_uring: allow defer completion for aux posted cqes
  io_uring: defer all io_req_complete_failed
  io_uring: always lock in io_apoll_task_func
  io_uring: remove iopoll spinlock
  io_uring: iopoll protect complete_post
  io_uring: inline __io_req_complete_put()
  io_uring: remove io_req_tw_post_queue
  io_uring: use io_req_task_complete() in timeout
  io_uring: hold locks for io_req_complete_failed
  io_uring: add completion locking for iopoll
  io_uring: kill io_cqring_ev_posted() and __io_cq_unlock_post()
  ...
</pre>
</div>
</content>
</entry>
<entry>
<title>io_uring: force multishot CQEs into task context</title>
<updated>2022-12-07T13:47:13+00:00</updated>
<author>
<name>Pavel Begunkov</name>
<email>asml.silence@gmail.com</email>
</author>
<published>2022-12-07T03:53:31+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=17add5cea2bbafea0d481f1a3ea9dea019a98ee9'/>
<id>17add5cea2bbafea0d481f1a3ea9dea019a98ee9</id>
<content type='text'>
Multishot are posting CQEs outside of the normal request completion
path, which is usually done from within a task work handler. However, it
might be not the case when it's yet to be polled but has been punted to
io-wq. Make it abide -&gt;task_complete and push it to the polling path
when executed by io-wq.

Signed-off-by: Pavel Begunkov &lt;asml.silence@gmail.com&gt;
Link: https://lore.kernel.org/r/d7714aaff583096769a0f26e8e747759e556feb1.1670384893.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Multishot are posting CQEs outside of the normal request completion
path, which is usually done from within a task work handler. However, it
might be not the case when it's yet to be polled but has been punted to
io-wq. Make it abide -&gt;task_complete and push it to the polling path
when executed by io-wq.

Signed-off-by: Pavel Begunkov &lt;asml.silence@gmail.com&gt;
Link: https://lore.kernel.org/r/d7714aaff583096769a0f26e8e747759e556feb1.1670384893.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>use less confusing names for iov_iter direction initializers</title>
<updated>2022-11-25T18:01:55+00:00</updated>
<author>
<name>Al Viro</name>
<email>viro@zeniv.linux.org.uk</email>
</author>
<published>2022-09-16T00:25:47+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=de4eda9de2d957ef2d6a8365a01e26a435e958cb'/>
<id>de4eda9de2d957ef2d6a8365a01e26a435e958cb</id>
<content type='text'>
READ/WRITE proved to be actively confusing - the meanings are
"data destination, as used with read(2)" and "data source, as
used with write(2)", but people keep interpreting those as
"we read data from it" and "we write data to it", i.e. exactly
the wrong way.

Call them ITER_DEST and ITER_SOURCE - at least that is harder
to misinterpret...

Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
READ/WRITE proved to be actively confusing - the meanings are
"data destination, as used with read(2)" and "data source, as
used with write(2)", but people keep interpreting those as
"we read data from it" and "we write data to it", i.e. exactly
the wrong way.

Call them ITER_DEST and ITER_SOURCE - at least that is harder
to misinterpret...

Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>io_uring: add io_aux_cqe which allows deferred completion</title>
<updated>2022-11-25T13:10:04+00:00</updated>
<author>
<name>Dylan Yudaken</name>
<email>dylany@meta.com</email>
</author>
<published>2022-11-24T09:35:55+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=9b8c54755a2b16d4f23c0ea184b75e2edf77d906'/>
<id>9b8c54755a2b16d4f23c0ea184b75e2edf77d906</id>
<content type='text'>
Use the just introduced deferred post cqe completion state when possible
in io_aux_cqe. If not possible fallback to io_post_aux_cqe.

This introduces a complication because of allow_overflow. For deferred
completions we cannot know without locking the completion_lock if it will
overflow (and even if we locked it, another post could sneak in and cause
this cqe to be in overflow).
However since overflow protection is mostly a best effort defence in depth
to prevent infinite loops of CQEs for poll, just checking the overflow bit
is going to be good enough and will result in at most 16 (array size of
deferred cqes) overflows.

Suggested-by: Pavel Begunkov &lt;asml.silence@gmail.com&gt;
Signed-off-by: Dylan Yudaken &lt;dylany@meta.com&gt;
Link: https://lore.kernel.org/r/20221124093559.3780686-6-dylany@meta.com
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Use the just introduced deferred post cqe completion state when possible
in io_aux_cqe. If not possible fallback to io_post_aux_cqe.

This introduces a complication because of allow_overflow. For deferred
completions we cannot know without locking the completion_lock if it will
overflow (and even if we locked it, another post could sneak in and cause
this cqe to be in overflow).
However since overflow protection is mostly a best effort defence in depth
to prevent infinite loops of CQEs for poll, just checking the overflow bit
is going to be good enough and will result in at most 16 (array size of
deferred cqes) overflows.

Suggested-by: Pavel Begunkov &lt;asml.silence@gmail.com&gt;
Signed-off-by: Dylan Yudaken &lt;dylany@meta.com&gt;
Link: https://lore.kernel.org/r/20221124093559.3780686-6-dylany@meta.com
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>io_uring: allow multishot recv CQEs to overflow</title>
<updated>2022-11-21T14:44:09+00:00</updated>
<author>
<name>Dylan Yudaken</name>
<email>dylany@meta.com</email>
</author>
<published>2022-11-07T12:52:35+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=e2ad599d1ed38fe743106f10d58a0cbfc00b51e2'/>
<id>e2ad599d1ed38fe743106f10d58a0cbfc00b51e2</id>
<content type='text'>
With commit aa1df3a360a0 ("io_uring: fix CQE reordering"), there are
stronger guarantees for overflow ordering. Specifically ensuring that
userspace will not receive out of order receive CQEs. Therefore this is
not needed any more for recv/recvmsg.

Signed-off-by: Dylan Yudaken &lt;dylany@meta.com&gt;
Link: https://lore.kernel.org/r/20221107125236.260132-4-dylany@meta.com
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
With commit aa1df3a360a0 ("io_uring: fix CQE reordering"), there are
stronger guarantees for overflow ordering. Specifically ensuring that
userspace will not receive out of order receive CQEs. Therefore this is
not needed any more for recv/recvmsg.

Signed-off-by: Dylan Yudaken &lt;dylany@meta.com&gt;
Link: https://lore.kernel.org/r/20221107125236.260132-4-dylany@meta.com
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>io_uring: revert "io_uring fix multishot accept ordering"</title>
<updated>2022-11-21T14:43:42+00:00</updated>
<author>
<name>Dylan Yudaken</name>
<email>dylany@meta.com</email>
</author>
<published>2022-11-21T14:43:42+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=515e26961295bee9da5e26916c27739dca6c10e1'/>
<id>515e26961295bee9da5e26916c27739dca6c10e1</id>
<content type='text'>
This is no longer needed after commit aa1df3a360a0 ("io_uring: fix CQE
reordering"), since all reordering is now taken care of.

This reverts commit cbd25748545c ("io_uring: fix multishot accept
ordering").

Signed-off-by: Dylan Yudaken &lt;dylany@meta.com&gt;
Link: https://lore.kernel.org/r/20221107125236.260132-2-dylany@meta.com
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This is no longer needed after commit aa1df3a360a0 ("io_uring: fix CQE
reordering"), since all reordering is now taken care of.

This reverts commit cbd25748545c ("io_uring: fix multishot accept
ordering").

Signed-off-by: Dylan Yudaken &lt;dylany@meta.com&gt;
Link: https://lore.kernel.org/r/20221107125236.260132-2-dylany@meta.com
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
</feed>
