<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/io_uring/rw.c, branch v6.0</title>
<subtitle>Linux kernel source tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/'/>
<entry>
<title>io_uring/rw: fix error'ed retry return values</title>
<updated>2022-09-13T13:47:11+00:00</updated>
<author>
<name>Pavel Begunkov</name>
<email>asml.silence@gmail.com</email>
</author>
<published>2022-09-13T12:21:23+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=62bb0647b14646fa6c9aa25ecdf67ad18f13523c'/>
<id>62bb0647b14646fa6c9aa25ecdf67ad18f13523c</id>
<content type='text'>
Kernel test robot reports that we test negativity of an unsigned in
io_fixup_rw_res() after a recent change, which masks error codes and
messes up the return value in case I/O is retried and fails with
an error.

Fixes: 4d9cb92ca41dd ("io_uring/rw: fix short rw error handling")
Reported-by: kernel test robot &lt;lkp@intel.com&gt;
Signed-off-by: Pavel Begunkov &lt;asml.silence@gmail.com&gt;
Link: https://lore.kernel.org/r/9754a0970af1861e7865f9014f735c70dc60bf79.1663071587.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Kernel test robot reports that we test negativity of an unsigned in
io_fixup_rw_res() after a recent change, which masks error codes and
messes up the return value in case I/O is retried and fails with
an error.

Fixes: 4d9cb92ca41dd ("io_uring/rw: fix short rw error handling")
Reported-by: kernel test robot &lt;lkp@intel.com&gt;
Signed-off-by: Pavel Begunkov &lt;asml.silence@gmail.com&gt;
Link: https://lore.kernel.org/r/9754a0970af1861e7865f9014f735c70dc60bf79.1663071587.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>io_uring/rw: fix short rw error handling</title>
<updated>2022-09-09T14:57:57+00:00</updated>
<author>
<name>Pavel Begunkov</name>
<email>asml.silence@gmail.com</email>
</author>
<published>2022-09-09T11:11:49+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=4d9cb92ca41dd8e905a4569ceba4716c2f39c75a'/>
<id>4d9cb92ca41dd8e905a4569ceba4716c2f39c75a</id>
<content type='text'>
We have a couple of problems. First, there are reports of unexpected link
breakage for reads when cqe-&gt;res indicates that the IO was done in full.
The reason here is partial IO with retries.

TL;DR: we compare the result in __io_complete_rw_common() against
req-&gt;cqe.res, but req-&gt;cqe.res doesn't store the full length but rather
the length left to be done. So, when we pass the full corrected result
via kiocb_done() -&gt; __io_complete_rw_common(), it fails.

The second problem is that we don't try to correct res in
io_complete_rw(), which, for instance, might be a problem for O_DIRECT
when a prefix of the data was cached in the page cache. We also
definitely don't want to pass a corrected result into io_rw_done().

The fix here is to leave __io_complete_rw_common() alone, always pass
the uncorrected result into it and fix it up as the last step just before
actually finishing the I/O.

Cc: stable@vger.kernel.org
Signed-off-by: Pavel Begunkov &lt;asml.silence@gmail.com&gt;
Link: https://github.com/axboe/liburing/issues/643
Reported-by: Beld Zhang &lt;beldzhang@gmail.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
We have a couple of problems. First, there are reports of unexpected link
breakage for reads when cqe-&gt;res indicates that the IO was done in full.
The reason here is partial IO with retries.

TL;DR: we compare the result in __io_complete_rw_common() against
req-&gt;cqe.res, but req-&gt;cqe.res doesn't store the full length but rather
the length left to be done. So, when we pass the full corrected result
via kiocb_done() -&gt; __io_complete_rw_common(), it fails.

The second problem is that we don't try to correct res in
io_complete_rw(), which, for instance, might be a problem for O_DIRECT
when a prefix of the data was cached in the page cache. We also
definitely don't want to pass a corrected result into io_rw_done().

The fix here is to leave __io_complete_rw_common() alone, always pass
the uncorrected result into it and fix it up as the last step just before
actually finishing the I/O.

Cc: stable@vger.kernel.org
Signed-off-by: Pavel Begunkov &lt;asml.silence@gmail.com&gt;
Link: https://github.com/axboe/liburing/issues/643
Reported-by: Beld Zhang &lt;beldzhang@gmail.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge tag 'io_uring-6.0-2022-08-13' of git://git.kernel.dk/linux-block</title>
<updated>2022-08-13T20:28:54+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2022-08-13T20:28:54+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=1da8cf961bb13f4c3ea11373696b5ac986a47cde'/>
<id>1da8cf961bb13f4c3ea11373696b5ac986a47cde</id>
<content type='text'>
Pull io_uring fixes from Jens Axboe:

 - Regression fix for this merge window, fixing a wrong order of
   arguments for io_req_set_res() for passthru (Dylan)

 - Fix for the audit code leaking context memory (Peilin)

 - Ensure that provided buffers are memcg accounted (Pavel)

 - Correctly handle short zero-copy sends (Pavel)

 - Sparse warning fixes for the recvmsg multishot command (Dylan)

 - Error handling fix for passthru (Anuj)

 - Remove randomization of struct kiocb fields, to avoid it growing in
   size if re-arranged in such a fashion that it grows more holes or
   padding (Keith, Linus)

 - Small series improving type safety of the sqe fields (Stefan)

* tag 'io_uring-6.0-2022-08-13' of git://git.kernel.dk/linux-block:
  io_uring: add missing BUILD_BUG_ON() checks for new io_uring_sqe fields
  io_uring: make io_kiocb_to_cmd() typesafe
  fs: don't randomize struct kiocb fields
  io_uring: consistently make use of io_notif_to_data()
  io_uring: fix error handling for io_uring_cmd
  io_uring: fix io_recvmsg_prep_multishot sparse warnings
  io_uring/net: send retry for zerocopy
  io_uring: mem-account pbuf buckets
  audit, io_uring, io-wq: Fix memory leak in io_sq_thread() and io_wqe_worker()
  io_uring: pass correct parameters to io_req_set_res
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull io_uring fixes from Jens Axboe:

 - Regression fix for this merge window, fixing a wrong order of
   arguments for io_req_set_res() for passthru (Dylan)

 - Fix for the audit code leaking context memory (Peilin)

 - Ensure that provided buffers are memcg accounted (Pavel)

 - Correctly handle short zero-copy sends (Pavel)

 - Sparse warning fixes for the recvmsg multishot command (Dylan)

 - Error handling fix for passthru (Anuj)

 - Remove randomization of struct kiocb fields, to avoid it growing in
   size if re-arranged in such a fashion that it grows more holes or
   padding (Keith, Linus)

 - Small series improving type safety of the sqe fields (Stefan)

* tag 'io_uring-6.0-2022-08-13' of git://git.kernel.dk/linux-block:
  io_uring: add missing BUILD_BUG_ON() checks for new io_uring_sqe fields
  io_uring: make io_kiocb_to_cmd() typesafe
  fs: don't randomize struct kiocb fields
  io_uring: consistently make use of io_notif_to_data()
  io_uring: fix error handling for io_uring_cmd
  io_uring: fix io_recvmsg_prep_multishot sparse warnings
  io_uring/net: send retry for zerocopy
  io_uring: mem-account pbuf buckets
  audit, io_uring, io-wq: Fix memory leak in io_sq_thread() and io_wqe_worker()
  io_uring: pass correct parameters to io_req_set_res
</pre>
</div>
</content>
</entry>
<entry>
<title>io_uring: make io_kiocb_to_cmd() typesafe</title>
<updated>2022-08-12T23:01:00+00:00</updated>
<author>
<name>Stefan Metzmacher</name>
<email>metze@samba.org</email>
</author>
<published>2022-08-11T07:11:15+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=f2ccb5aed7bce1d8b3ed5b3385759a5509663028'/>
<id>f2ccb5aed7bce1d8b3ed5b3385759a5509663028</id>
<content type='text'>
We need to make sure (at build time) that struct io_cmd_data is not
cast to a structure that's larger.

Signed-off-by: Stefan Metzmacher &lt;metze@samba.org&gt;
Link: https://lore.kernel.org/r/c024cdf25ae19fc0319d4180e2298bade8ed17b8.1660201408.git.metze@samba.org
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
We need to make sure (at build time) that struct io_cmd_data is not
cast to a structure that's larger.

Signed-off-by: Stefan Metzmacher &lt;metze@samba.org&gt;
Link: https://lore.kernel.org/r/c024cdf25ae19fc0319d4180e2298bade8ed17b8.1660201408.git.metze@samba.org
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge tag 'pull-work.iov_iter-base' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs</title>
<updated>2022-08-03T20:50:22+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2022-08-03T20:50:22+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=5264406cdb66c7003eb3edf53c9773b1b20611b9'/>
<id>5264406cdb66c7003eb3edf53c9773b1b20611b9</id>
<content type='text'>
Pull vfs iov_iter updates from Al Viro:
 "Part 1 - isolated cleanups and optimizations.

  One of the goals is to reduce the overhead of using -&gt;read_iter() and
  -&gt;write_iter() instead of -&gt;read()/-&gt;write().

  new_sync_{read,write}() has a surprising amount of overhead, in
  particular inside iocb_flags(). That's why the beginning of the
  series is in this pile; it's not directly iov_iter-related, but
  it's a part of the same work..."

* tag 'pull-work.iov_iter-base' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  first_iovec_segment(): just return address
  iov_iter: massage calling conventions for first_{iovec,bvec}_segment()
  iov_iter: first_{iovec,bvec}_segment() - simplify a bit
  iov_iter: lift dealing with maxpages out of first_{iovec,bvec}_segment()
  iov_iter_get_pages{,_alloc}(): cap the maxsize with MAX_RW_COUNT
  iov_iter_bvec_advance(): don't bother with bvec_iter
  copy_page_{to,from}_iter(): switch iovec variants to generic
  keep iocb_flags() result cached in struct file
  iocb: delay evaluation of IS_SYNC(...) until we want to check IOCB_DSYNC
  struct file: use anonymous union member for rcuhead and llist
  btrfs: use IOMAP_DIO_NOSYNC
  teach iomap_dio_rw() to suppress dsync
  No need of likely/unlikely on calls of check_copy_size()
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull vfs iov_iter updates from Al Viro:
 "Part 1 - isolated cleanups and optimizations.

  One of the goals is to reduce the overhead of using -&gt;read_iter() and
  -&gt;write_iter() instead of -&gt;read()/-&gt;write().

  new_sync_{read,write}() has a surprising amount of overhead, in
  particular inside iocb_flags(). That's why the beginning of the
  series is in this pile; it's not directly iov_iter-related, but
  it's a part of the same work..."

* tag 'pull-work.iov_iter-base' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  first_iovec_segment(): just return address
  iov_iter: massage calling conventions for first_{iovec,bvec}_segment()
  iov_iter: first_{iovec,bvec}_segment() - simplify a bit
  iov_iter: lift dealing with maxpages out of first_{iovec,bvec}_segment()
  iov_iter_get_pages{,_alloc}(): cap the maxsize with MAX_RW_COUNT
  iov_iter_bvec_advance(): don't bother with bvec_iter
  copy_page_{to,from}_iter(): switch iovec variants to generic
  keep iocb_flags() result cached in struct file
  iocb: delay evaluation of IS_SYNC(...) until we want to check IOCB_DSYNC
  struct file: use anonymous union member for rcuhead and llist
  btrfs: use IOMAP_DIO_NOSYNC
  teach iomap_dio_rw() to suppress dsync
  No need of likely/unlikely on calls of check_copy_size()
</pre>
</div>
</content>
</entry>
<entry>
<title>io_uring: Add tracepoint for short writes</title>
<updated>2022-07-25T00:39:32+00:00</updated>
<author>
<name>Stefan Roesch</name>
<email>shr@fb.com</email>
</author>
<published>2022-06-16T21:22:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=1c849b481b3e4f8c36f297cd3aa88ef52a19cee9'/>
<id>1c849b481b3e4f8c36f297cd3aa88ef52a19cee9</id>
<content type='text'>
This adds the io_uring_short_write tracepoint to io_uring. A short write
is issued if not all pages that are required for a write are in the page
cache and the async buffered writes have to return EAGAIN.

Signed-off-by: Stefan Roesch &lt;shr@fb.com&gt;
Link: https://lore.kernel.org/r/20220616212221.2024518-13-shr@fb.com
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This adds the io_uring_short_write tracepoint to io_uring. A short write
is issued if not all pages that are required for a write are in the page
cache and the async buffered writes have to return EAGAIN.

Signed-off-by: Stefan Roesch &lt;shr@fb.com&gt;
Link: https://lore.kernel.org/r/20220616212221.2024518-13-shr@fb.com
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>io_uring: fix issue with io_write() not always undoing sb_start_write()</title>
<updated>2022-07-25T00:39:32+00:00</updated>
<author>
<name>Jens Axboe</name>
<email>axboe@kernel.dk</email>
</author>
<published>2022-06-24T16:24:45+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=e053aaf4da56cbf0afb33a0fda4a62188e2c0637'/>
<id>e053aaf4da56cbf0afb33a0fda4a62188e2c0637</id>
<content type='text'>
This is actually an older issue, but we never used to hit the -EAGAIN
path before having done sb_start_write(). Make sure that we always call
kiocb_end_write() if we need to retry the write, so that we keep the
calls to sb_start_write() etc balanced.

Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This is actually an older issue, but we never used to hit the -EAGAIN
path before having done sb_start_write(). Make sure that we always call
kiocb_end_write() if we need to retry the write, so that we keep the
calls to sb_start_write() etc balanced.

Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>io_uring: Add support for async buffered writes</title>
<updated>2022-07-25T00:39:32+00:00</updated>
<author>
<name>Stefan Roesch</name>
<email>shr@fb.com</email>
</author>
<published>2022-06-16T21:22:18+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=4e17aaab54359fa2cdeb0080c822a08f2980f979'/>
<id>4e17aaab54359fa2cdeb0080c822a08f2980f979</id>
<content type='text'>
This enables the async buffered writes for the filesystems that support
async buffered writes in io_uring. Buffered writes are enabled for
blocks that are already in the page cache or can be acquired with noio.

Signed-off-by: Stefan Roesch &lt;shr@fb.com&gt;
Link: https://lore.kernel.org/r/20220616212221.2024518-12-shr@fb.com
[axboe: adapt to 5.20 branch]
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This enables the async buffered writes for the filesystems that support
async buffered writes in io_uring. Buffered writes are enabled for
blocks that are already in the page cache or can be acquired with noio.

Signed-off-by: Stefan Roesch &lt;shr@fb.com&gt;
Link: https://lore.kernel.org/r/20220616212221.2024518-12-shr@fb.com
[axboe: adapt to 5.20 branch]
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>io_uring: ensure REQ_F_ISREG is set async offload</title>
<updated>2022-07-25T00:39:18+00:00</updated>
<author>
<name>Jens Axboe</name>
<email>axboe@kernel.dk</email>
</author>
<published>2022-07-21T15:06:47+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=f6b543fd03d347e8bf245cee4f2d54eb6ffd8fcb'/>
<id>f6b543fd03d347e8bf245cee4f2d54eb6ffd8fcb</id>
<content type='text'>
If we're offloading requests directly to io-wq because IOSQE_ASYNC was
set in the sqe, we can miss hashing writes appropriately because we
haven't set REQ_F_ISREG yet. This can cause a performance regression
with buffered writes, as io-wq then no longer correctly serializes writes
to that file.

Ensure that we set the flags in io_prep_async_work(), which will cause
the io-wq work item to be hashed appropriately.

Fixes: 584b0180f0f4 ("io_uring: move read/write file prep state into actual opcode handler")
Link: https://lore.kernel.org/io-uring/20220608080054.GB22428@xsang-OptiPlex-9020/
Reported-by: kernel test robot &lt;oliver.sang@intel.com&gt;
Tested-by: Yin Fengwei &lt;fengwei.yin@intel.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
If we're offloading requests directly to io-wq because IOSQE_ASYNC was
set in the sqe, we can miss hashing writes appropriately because we
haven't set REQ_F_ISREG yet. This can cause a performance regression
with buffered writes, as io-wq then no longer correctly serializes writes
to that file.

Ensure that we set the flags in io_prep_async_work(), which will cause
the io-wq work item to be hashed appropriately.

Fixes: 584b0180f0f4 ("io_uring: move read/write file prep state into actual opcode handler")
Link: https://lore.kernel.org/io-uring/20220608080054.GB22428@xsang-OptiPlex-9020/
Reported-by: kernel test robot &lt;oliver.sang@intel.com&gt;
Tested-by: Yin Fengwei &lt;fengwei.yin@intel.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>io_uring: remove priority tw list optimisation</title>
<updated>2022-07-25T00:39:15+00:00</updated>
<author>
<name>Dylan Yudaken</name>
<email>dylany@fb.com</email>
</author>
<published>2022-06-22T13:40:21+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=ed5ccb3beeba0cadb0fcf353ae192021dfecf252'/>
<id>ed5ccb3beeba0cadb0fcf353ae192021dfecf252</id>
<content type='text'>
This optimisation has some built-in assumptions that make it easy to
introduce bugs. It also does not have clear wins that make it worth keeping.

Signed-off-by: Dylan Yudaken &lt;dylany@fb.com&gt;
Link: https://lore.kernel.org/r/20220622134028.2013417-2-dylany@fb.com
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This optimisation has some built-in assumptions that make it easy to
introduce bugs. It also does not have clear wins that make it worth keeping.

Signed-off-by: Dylan Yudaken &lt;dylany@fb.com&gt;
Link: https://lore.kernel.org/r/20220622134028.2013417-2-dylany@fb.com
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
</feed>
