<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-stable.git/fs/io_uring.c, branch v5.5</title>
<subtitle>Linux kernel stable tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/'/>
<entry>
<title>io_uring: don't cancel all work on process exit</title>
<updated>2020-01-26T17:17:12+00:00</updated>
<author>
<name>Jens Axboe</name>
<email>axboe@kernel.dk</email>
</author>
<published>2020-01-26T17:17:12+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=ebe10026210f9ea740b9a050ee84a166690fddde'/>
<id>ebe10026210f9ea740b9a050ee84a166690fddde</id>
<content type='text'>
If we're sharing the ring across forks, then one process exiting means
that we cancel ALL work and prevent future work. This is overly
restrictive. As long as we cancel the work associated with the files
from the current task, it's safe to let others persist. Normal fd close
on exit will still wait (and cancel) pending work.

Fixes: fcb323cc53e2 ("io_uring: io_uring: add support for async work inheriting files")
Reported-by: Andres Freund &lt;andres@anarazel.de&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
If we're sharing the ring across forks, then one process exiting means
that we cancel ALL work and prevent future work. This is overly
restrictive. As long as we cancel the work associated with the files
from the current task, it's safe to let others persist. Normal fd close
on exit will still wait (and cancel) pending work.

Fixes: fcb323cc53e2 ("io_uring: io_uring: add support for async work inheriting files")
Reported-by: Andres Freund &lt;andres@anarazel.de&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Revert "io_uring: only allow submit from owning task"</title>
<updated>2020-01-26T16:56:05+00:00</updated>
<author>
<name>Jens Axboe</name>
<email>axboe@kernel.dk</email>
</author>
<published>2020-01-26T16:53:12+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=73e08e711d9c1d79fae01daed4b0e1fee5f8a275'/>
<id>73e08e711d9c1d79fae01daed4b0e1fee5f8a275</id>
<content type='text'>
This ends up being too restrictive for tasks that willingly fork and
share the ring between forks. Andres reports that this breaks his
postgresql work. Since we're close to 5.5 release, revert this change
for now.

Cc: stable@vger.kernel.org
Fixes: 44d282796f81 ("io_uring: only allow submit from owning task")
Reported-by: Andres Freund &lt;andres@anarazel.de&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This ends up being too restrictive for tasks that willingly fork and
share the ring between forks. Andres reports that this breaks his
postgresql work. Since we're close to 5.5 release, revert this change
for now.

Cc: stable@vger.kernel.org
Fixes: 44d282796f81 ("io_uring: only allow submit from owning task")
Reported-by: Andres Freund &lt;andres@anarazel.de&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>io_uring: fix compat for IORING_REGISTER_FILES_UPDATE</title>
<updated>2020-01-21T00:00:44+00:00</updated>
<author>
<name>Eugene Syromiatnikov</name>
<email>esyr@redhat.com</email>
</author>
<published>2020-01-15T16:35:38+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=1292e972fff2b2d81e139e0c2fe5f50249e78c58'/>
<id>1292e972fff2b2d81e139e0c2fe5f50249e78c58</id>
<content type='text'>
fds field of struct io_uring_files_update is problematic with regards
to compat user space, as pointer size is different in 32-bit, 32-on-64-bit,
and 64-bit user space.  In order to avoid custom handling of compat in
the syscall implementation, make fds __u64 and use u64_to_user_ptr in
order to retrieve it.  Also, align the field naturally and check that
no garbage is passed there.

Fixes: c3a31e605620c279 ("io_uring: add support for IORING_REGISTER_FILES_UPDATE")
Signed-off-by: Eugene Syromiatnikov &lt;esyr@redhat.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
fds field of struct io_uring_files_update is problematic with regards
to compat user space, as pointer size is different in 32-bit, 32-on-64-bit,
and 64-bit user space.  In order to avoid custom handling of compat in
the syscall implementation, make fds __u64 and use u64_to_user_ptr in
order to retrieve it.  Also, align the field naturally and check that
no garbage is passed there.

Fixes: c3a31e605620c279 ("io_uring: add support for IORING_REGISTER_FILES_UPDATE")
Signed-off-by: Eugene Syromiatnikov &lt;esyr@redhat.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>io_uring: only allow submit from owning task</title>
<updated>2020-01-17T04:43:24+00:00</updated>
<author>
<name>Jens Axboe</name>
<email>axboe@kernel.dk</email>
</author>
<published>2020-01-17T02:00:24+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=44d282796f81eb1debc1d7cb53245b4cb3214cb5'/>
<id>44d282796f81eb1debc1d7cb53245b4cb3214cb5</id>
<content type='text'>
If the credentials or the mm doesn't match, don't allow the task to
submit anything on behalf of this ring. The task that owns the ring can
pass the file descriptor to another task, but we don't want to allow
that task to submit an SQE that then assumes the ring mm and creds if
it needs to go async.

Cc: stable@vger.kernel.org
Suggested-by: Stefan Metzmacher &lt;metze@samba.org&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
If the credentials or the mm doesn't match, don't allow the task to
submit anything on behalf of this ring. The task that owns the ring can
pass the file descriptor to another task, but we don't want to allow
that task to submit an SQE that then assumes the ring mm and creds if
it needs to go async.

Cc: stable@vger.kernel.org
Suggested-by: Stefan Metzmacher &lt;metze@samba.org&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>io_uring: ensure workqueue offload grabs ring mutex for poll list</title>
<updated>2020-01-16T04:51:17+00:00</updated>
<author>
<name>Jens Axboe</name>
<email>axboe@kernel.dk</email>
</author>
<published>2020-01-16T04:51:17+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=11ba820bf163e224bf5dd44e545a66a44a5b1d7a'/>
<id>11ba820bf163e224bf5dd44e545a66a44a5b1d7a</id>
<content type='text'>
A previous commit moved the locking for the async sqthread, but didn't
take into account that the io-wq workers still need it. We can't use
req-&gt;in_async for this anymore as both the sqthread and io-wq workers
set it, gate the need for locking on io_wq_current_is_worker() instead.

Fixes: 8a4955ff1cca ("io_uring: sqthread should grab ctx-&gt;uring_lock for submissions")
Reported-by: Bijan Mottahedeh &lt;bijan.mottahedeh@oracle.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
A previous commit moved the locking for the async sqthread, but didn't
take into account that the io-wq workers still need it. We can't use
req-&gt;in_async for this anymore as both the sqthread and io-wq workers
set it, gate the need for locking on io_wq_current_is_worker() instead.

Fixes: 8a4955ff1cca ("io_uring: sqthread should grab ctx-&gt;uring_lock for submissions")
Reported-by: Bijan Mottahedeh &lt;bijan.mottahedeh@oracle.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>io_uring: clear req-&gt;result always before issuing a read/write request</title>
<updated>2020-01-16T04:36:13+00:00</updated>
<author>
<name>Bijan Mottahedeh</name>
<email>bijan.mottahedeh@oracle.com</email>
</author>
<published>2020-01-16T02:37:45+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=797f3f535d59f05ad12c629338beef6cb801d19e'/>
<id>797f3f535d59f05ad12c629338beef6cb801d19e</id>
<content type='text'>
req-&gt;result is cleared when io_issue_sqe() calls io_read/write_pre()
routines.  Those routines however are not called when the sqe
argument is NULL, which is the case when io_issue_sqe() is called from
io_wq_submit_work().  io_issue_sqe() may then examine a stale result if
a polled request had previously failed with -EAGAIN:

        if (ctx-&gt;flags &amp; IORING_SETUP_IOPOLL) {
                if (req-&gt;result == -EAGAIN)
                        return -EAGAIN;

                io_iopoll_req_issued(req);
        }

and in turn cause a subsequently completed request to be re-issued in
io_wq_submit_work().

Signed-off-by: Bijan Mottahedeh &lt;bijan.mottahedeh@oracle.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
req-&gt;result is cleared when io_issue_sqe() calls io_read/write_pre()
routines.  Those routines however are not called when the sqe
argument is NULL, which is the case when io_issue_sqe() is called from
io_wq_submit_work().  io_issue_sqe() may then examine a stale result if
a polled request had previously failed with -EAGAIN:

        if (ctx-&gt;flags &amp; IORING_SETUP_IOPOLL) {
                if (req-&gt;result == -EAGAIN)
                        return -EAGAIN;

                io_iopoll_req_issued(req);
        }

and in turn cause a subsequently completed request to be re-issued in
io_wq_submit_work().

Signed-off-by: Bijan Mottahedeh &lt;bijan.mottahedeh@oracle.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>io_uring: be consistent in assigning next work from handler</title>
<updated>2020-01-15T05:09:06+00:00</updated>
<author>
<name>Jens Axboe</name>
<email>axboe@kernel.dk</email>
</author>
<published>2020-01-15T05:09:06+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=78912934f4f7dd7a424159c69bf9bdd46e823781'/>
<id>78912934f4f7dd7a424159c69bf9bdd46e823781</id>
<content type='text'>
If we pass back dependent work in case of links, we need to always
ensure that we call the link setup and work prep handler. If not, we
might be missing some setup for the next work item.

Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
If we pass back dependent work in case of links, we need to always
ensure that we call the link setup and work prep handler. If not, we
might be missing some setup for the next work item.

Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>io_uring: don't setup async context for read/write fixed</title>
<updated>2020-01-14T02:25:29+00:00</updated>
<author>
<name>Jens Axboe</name>
<email>axboe@kernel.dk</email>
</author>
<published>2020-01-14T02:23:24+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=74566df3a71c1b92da608868cca787557d8be7b2'/>
<id>74566df3a71c1b92da608868cca787557d8be7b2</id>
<content type='text'>
We don't need it, and if we have it, then the retry handler will attempt
to copy the non-existent iovec with the inline iovec, with a segment
count that doesn't make sense.

Fixes: f67676d160c6 ("io_uring: ensure async punted read/write requests copy iovec")
Reported-by: Jonathan Lemon &lt;jonathan.lemon@gmail.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
We don't need it, and if we have it, then the retry handler will attempt
to copy the non-existent iovec with the inline iovec, with a segment
count that doesn't make sense.

Fixes: f67676d160c6 ("io_uring: ensure async punted read/write requests copy iovec")
Reported-by: Jonathan Lemon &lt;jonathan.lemon@gmail.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>io_uring: remove punt of short reads to async context</title>
<updated>2020-01-07T20:08:56+00:00</updated>
<author>
<name>Jens Axboe</name>
<email>axboe@kernel.dk</email>
</author>
<published>2020-01-07T20:08:56+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=eacc6dfaea963ef61540abb31ad7829be5eff284'/>
<id>eacc6dfaea963ef61540abb31ad7829be5eff284</id>
<content type='text'>
We currently punt any short read on a regular file to async context,
but this fails if the short read is due to running into EOF. This is
especially problematic since we only do the single prep for commands
now, as we don't reset kiocb-&gt;ki_pos. This can result in a 4k read on
a 1k file returning zero, as we detect the short read and then retry
from async context. At the time of retry, the position is now 1k, and
we end up reading nothing, and hence return 0.

Instead of trying to patch around the fact that short reads can be
legitimate and won't succeed in case of retry, remove the logic to punt
a short read to async context. Simply return it.

Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
We currently punt any short read on a regular file to async context,
but this fails if the short read is due to running into EOF. This is
especially problematic since we only do the single prep for commands
now, as we don't reset kiocb-&gt;ki_pos. This can result in a 4k read on
a 1k file returning zero, as we detect the short read and then retry
from async context. At the time of retry, the position is now 1k, and
we end up reading nothing, and hence return 0.

Instead of trying to patch around the fact that short reads can be
legitimate and won't succeed in case of retry, remove the logic to punt
a short read to async context. Simply return it.

Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>io_uring: pass in 'sqe' to the prep handlers</title>
<updated>2019-12-20T17:04:50+00:00</updated>
<author>
<name>Jens Axboe</name>
<email>axboe@kernel.dk</email>
</author>
<published>2019-12-20T01:24:38+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=3529d8c2b353e6e446277ae96a36e7471cb070fc'/>
<id>3529d8c2b353e6e446277ae96a36e7471cb070fc</id>
<content type='text'>
This moves the prep handlers outside of the opcode handlers, and allows
us to pass in the sqe directly. If the sqe is non-NULL, it means that
the request should be prepared for the first time.

With the opcode handlers not having access to the sqe at all, we are
guaranteed that the prep handler has setup the request fully by the
time we get there. As before, for opcodes that need to copy in more
data then the io_kiocb allows for, the io_async_ctx holds that info. If
a prep handler is invoked with req-&gt;io set, it must use that to retain
information for later.

Finally, we can remove io_kiocb-&gt;sqe as well.

Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This moves the prep handlers outside of the opcode handlers, and allows
us to pass in the sqe directly. If the sqe is non-NULL, it means that
the request should be prepared for the first time.

With the opcode handlers not having access to the sqe at all, we are
guaranteed that the prep handler has setup the request fully by the
time we get there. As before, for opcodes that need to copy in more
data then the io_kiocb allows for, the io_async_ctx holds that info. If
a prep handler is invoked with req-&gt;io set, it must use that to retain
information for later.

Finally, we can remove io_kiocb-&gt;sqe as well.

Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
</feed>
