<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-stable.git/io_uring/io_uring.c, branch v6.2</title>
<subtitle>Linux kernel stable tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/'/>
<entry>
<title>io_uring: always prep_async for drain requests</title>
<updated>2023-01-27T13:29:29+00:00</updated>
<author>
<name>Dylan Yudaken</name>
<email>dylany@meta.com</email>
</author>
<published>2023-01-27T10:59:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=ef5c600adb1d985513d2b612cc90403a148ff287'/>
<id>ef5c600adb1d985513d2b612cc90403a148ff287</id>
<content type='text'>
Drain requests all go through io_drain_req, which has a quick exit in case
there is nothing pending (ie the drain is not useful). In that case it can
run the issue the request immediately.

However for safety it queues it through task work.
The problem is that in this case the request is run asynchronously, but
the async work has not been prepared through io_req_prep_async.

This has not been a problem up to now, as the task work always would run
before returning to userspace, and so the user would not have a chance to
race with it.

However - with IORING_SETUP_DEFER_TASKRUN - this is no longer the case and
the work might be defered, giving userspace a chance to change data being
referred to in the request.

Instead _always_ prep_async for drain requests, which is simpler anyway
and removes this issue.

Cc: stable@vger.kernel.org
Fixes: c0e0d6ba25f1 ("io_uring: add IORING_SETUP_DEFER_TASKRUN")
Signed-off-by: Dylan Yudaken &lt;dylany@meta.com&gt;
Link: https://lore.kernel.org/r/20230127105911.2420061-1-dylany@meta.com
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Drain requests all go through io_drain_req, which has a quick exit in case
there is nothing pending (ie the drain is not useful). In that case it can
run the issue the request immediately.

However for safety it queues it through task work.
The problem is that in this case the request is run asynchronously, but
the async work has not been prepared through io_req_prep_async.

This has not been a problem up to now, as the task work always would run
before returning to userspace, and so the user would not have a chance to
race with it.

However - with IORING_SETUP_DEFER_TASKRUN - this is no longer the case and
the work might be defered, giving userspace a chance to change data being
referred to in the request.

Instead _always_ prep_async for drain requests, which is simpler anyway
and removes this issue.

Cc: stable@vger.kernel.org
Fixes: c0e0d6ba25f1 ("io_uring: add IORING_SETUP_DEFER_TASKRUN")
Signed-off-by: Dylan Yudaken &lt;dylany@meta.com&gt;
Link: https://lore.kernel.org/r/20230127105911.2420061-1-dylany@meta.com
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>io_uring/msg_ring: fix remote queue to disabled ring</title>
<updated>2023-01-20T16:49:34+00:00</updated>
<author>
<name>Pavel Begunkov</name>
<email>asml.silence@gmail.com</email>
</author>
<published>2023-01-20T16:38:06+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=8579538c89e33ce78be2feb41e07489c8cbf8f31'/>
<id>8579538c89e33ce78be2feb41e07489c8cbf8f31</id>
<content type='text'>
IORING_SETUP_R_DISABLED rings don't have the submitter task set, so
it's not always safe to use -&gt;submitter_task. Disallow posting msg_ring
messaged to disabled rings. Also add task NULL check for loosy sync
around testing for IORING_SETUP_R_DISABLED.

Cc: stable@vger.kernel.org
Fixes: 6d043ee1164ca ("io_uring: do msg_ring in target task via tw")
Signed-off-by: Pavel Begunkov &lt;asml.silence@gmail.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
IORING_SETUP_R_DISABLED rings don't have the submitter task set, so
it's not always safe to use -&gt;submitter_task. Disallow posting msg_ring
messaged to disabled rings. Also add task NULL check for loosy sync
around testing for IORING_SETUP_R_DISABLED.

Cc: stable@vger.kernel.org
Fixes: 6d043ee1164ca ("io_uring: do msg_ring in target task via tw")
Signed-off-by: Pavel Begunkov &lt;asml.silence@gmail.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>io_uring: fix CQ waiting timeout handling</title>
<updated>2023-01-05T15:04:47+00:00</updated>
<author>
<name>Pavel Begunkov</name>
<email>asml.silence@gmail.com</email>
</author>
<published>2023-01-05T10:49:15+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=12521a5d5cb7ff0ad43eadfc9c135d86e1131fa8'/>
<id>12521a5d5cb7ff0ad43eadfc9c135d86e1131fa8</id>
<content type='text'>
Jiffy to ktime CQ waiting conversion broke how we treat timeouts, in
particular we rearm it anew every time we get into
io_cqring_wait_schedule() without adjusting the timeout. Waiting for 2
CQEs and getting a task_work in the middle may double the timeout value,
or even worse in some cases task may wait indefinitely.

Cc: stable@vger.kernel.org
Fixes: 228339662b398 ("io_uring: don't convert to jiffies for waiting on timeouts")
Signed-off-by: Pavel Begunkov &lt;asml.silence@gmail.com&gt;
Link: https://lore.kernel.org/r/f7bffddd71b08f28a877d44d37ac953ddb01590d.1672915663.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Jiffy to ktime CQ waiting conversion broke how we treat timeouts, in
particular we rearm it anew every time we get into
io_cqring_wait_schedule() without adjusting the timeout. Waiting for 2
CQEs and getting a task_work in the middle may double the timeout value,
or even worse in some cases task may wait indefinitely.

Cc: stable@vger.kernel.org
Fixes: 228339662b398 ("io_uring: don't convert to jiffies for waiting on timeouts")
Signed-off-by: Pavel Begunkov &lt;asml.silence@gmail.com&gt;
Link: https://lore.kernel.org/r/f7bffddd71b08f28a877d44d37ac953ddb01590d.1672915663.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>io_uring: lockdep annotate CQ locking</title>
<updated>2023-01-04T02:05:41+00:00</updated>
<author>
<name>Pavel Begunkov</name>
<email>asml.silence@gmail.com</email>
</author>
<published>2023-01-04T01:34:57+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=f26cc9593581bd734c846bf827401350b36dc3c9'/>
<id>f26cc9593581bd734c846bf827401350b36dc3c9</id>
<content type='text'>
Locking around CQE posting is complex and depends on options the ring is
created with, add more thorough lockdep annotations checking all
invariants.

Signed-off-by: Pavel Begunkov &lt;asml.silence@gmail.com&gt;
Link: https://lore.kernel.org/r/aa3770b4eacae3915d782cc2ab2f395a99b4b232.1672795976.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Locking around CQE posting is complex and depends on options the ring is
created with, add more thorough lockdep annotations checking all
invariants.

Signed-off-by: Pavel Begunkov &lt;asml.silence@gmail.com&gt;
Link: https://lore.kernel.org/r/aa3770b4eacae3915d782cc2ab2f395a99b4b232.1672795976.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>io_uring: pin context while queueing deferred tw</title>
<updated>2023-01-04T02:03:28+00:00</updated>
<author>
<name>Pavel Begunkov</name>
<email>asml.silence@gmail.com</email>
</author>
<published>2023-01-04T01:34:02+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=9ffa13ff78a0a55df968a72d6f0ebffccee5c9f4'/>
<id>9ffa13ff78a0a55df968a72d6f0ebffccee5c9f4</id>
<content type='text'>
Unlike normal tw, nothing prevents deferred tw to be executed right
after an tw item added to -&gt;work_llist in io_req_local_work_add(). For
instance, the waiting task may get waken up by CQ posting or a normal
tw. Thus we need to pin the ring for the rest of io_req_local_work_add()

Cc: stable@vger.kernel.org
Fixes: c0e0d6ba25f18 ("io_uring: add IORING_SETUP_DEFER_TASKRUN")
Signed-off-by: Pavel Begunkov &lt;asml.silence@gmail.com&gt;
Link: https://lore.kernel.org/r/1a79362b9c10b8523ef70b061d96523650a23344.1672795998.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Unlike normal tw, nothing prevents deferred tw to be executed right
after an tw item added to -&gt;work_llist in io_req_local_work_add(). For
instance, the waiting task may get waken up by CQ posting or a normal
tw. Thus we need to pin the ring for the rest of io_req_local_work_add()

Cc: stable@vger.kernel.org
Fixes: c0e0d6ba25f18 ("io_uring: add IORING_SETUP_DEFER_TASKRUN")
Signed-off-by: Pavel Begunkov &lt;asml.silence@gmail.com&gt;
Link: https://lore.kernel.org/r/1a79362b9c10b8523ef70b061d96523650a23344.1672795998.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>io_uring: check for valid register opcode earlier</title>
<updated>2022-12-23T13:40:32+00:00</updated>
<author>
<name>Jens Axboe</name>
<email>axboe@kernel.dk</email>
</author>
<published>2022-12-23T13:37:08+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=343190841a1f22b96996d9f8cfab902a4d1bfd0e'/>
<id>343190841a1f22b96996d9f8cfab902a4d1bfd0e</id>
<content type='text'>
We only check the register opcode value inside the restricted ring
section, move it into the main io_uring_register() function instead
and check it up front.

Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
We only check the register opcode value inside the restricted ring
section, move it into the main io_uring_register() function instead
and check it up front.

Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>io_uring: finish waiting before flushing overflow entries</title>
<updated>2022-12-21T15:43:53+00:00</updated>
<author>
<name>Jens Axboe</name>
<email>axboe@kernel.dk</email>
</author>
<published>2022-12-21T14:05:09+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=52ea806ad983490b3132a9e526e11a10dc2fd10c'/>
<id>52ea806ad983490b3132a9e526e11a10dc2fd10c</id>
<content type='text'>
If we have overflow entries being generated after we've done the
initial flush in io_cqring_wait(), then we could be flushing them in the
main wait loop as well. If that's done after having added ourselves
to the cq_wait waitqueue, then the task state can be != TASK_RUNNING
when we enter the overflow flush.

Check for the need to overflow flush, and finish our wait cycle first
if we have to do so.

Reported-and-tested-by: syzbot+cf6ea1d6bb30a4ce10b2@syzkaller.appspotmail.com
Link: https://lore.kernel.org/io-uring/000000000000cb143a05f04eee15@google.com/
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
If we have overflow entries being generated after we've done the
initial flush in io_cqring_wait(), then we could be flushing them in the
main wait loop as well. If that's done after having added ourselves
to the cq_wait waitqueue, then the task state can be != TASK_RUNNING
when we enter the overflow flush.

Check for the need to overflow flush, and finish our wait cycle first
if we have to do so.

Reported-and-tested-by: syzbot+cf6ea1d6bb30a4ce10b2@syzkaller.appspotmail.com
Link: https://lore.kernel.org/io-uring/000000000000cb143a05f04eee15@google.com/
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>io_uring: include task_work run after scheduling in wait for events</title>
<updated>2022-12-18T03:35:54+00:00</updated>
<author>
<name>Jens Axboe</name>
<email>axboe@kernel.dk</email>
</author>
<published>2022-12-17T20:42:24+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=35d90f95cfa773b7e3b1f57ba15ce06a470f354c'/>
<id>35d90f95cfa773b7e3b1f57ba15ce06a470f354c</id>
<content type='text'>
It's quite possible that we got woken up because task_work was queued,
and we need to process this task_work to generate the events waited for.
If we return to the wait loop without running task_work, we'll end up
adding the task to the waitqueue again, only to call
io_cqring_wait_schedule() again which will run the task_work. This is
less efficient than it could be, as it requires adding to the cq_wait
queue again. It also triggers the wakeup path for completions as
cq_wait is now non-empty with the task itself, and it'll require another
lock grab and deletion to remove ourselves from the waitqueue.

Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
It's quite possible that we got woken up because task_work was queued,
and we need to process this task_work to generate the events waited for.
If we return to the wait loop without running task_work, we'll end up
adding the task to the waitqueue again, only to call
io_cqring_wait_schedule() again which will run the task_work. This is
less efficient than it could be, as it requires adding to the cq_wait
queue again. It also triggers the wakeup path for completions as
cq_wait is now non-empty with the task itself, and it'll require another
lock grab and deletion to remove ourselves from the waitqueue.

Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>io_uring: use call_rcu_hurry if signaling an eventfd</title>
<updated>2022-12-15T18:59:29+00:00</updated>
<author>
<name>Dylan Yudaken</name>
<email>dylany@meta.com</email>
</author>
<published>2022-12-15T18:41:38+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=44a84da45272b3f4beb90025a64cfbde18f1aef0'/>
<id>44a84da45272b3f4beb90025a64cfbde18f1aef0</id>
<content type='text'>
io_uring uses call_rcu in the case it needs to signal an eventfd as a
result of an eventfd signal, since recursing eventfd signals are not
allowed. This should be calling the new call_rcu_hurry API to not delay
the signal.

Signed-off-by: Dylan Yudaken &lt;dylany@meta.com&gt;

Cc: Joel Fernandes (Google) &lt;joel@joelfernandes.org&gt;
Cc: Paul E. McKenney &lt;paulmck@kernel.org&gt;
Acked-by: Paul E. McKenney &lt;paulmck@kernel.org&gt;
Reviewed-by: Joel Fernandes (Google) &lt;joel@joelfernandes.org&gt;
Link: https://lore.kernel.org/r/20221215184138.795576-1-dylany@meta.com
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
io_uring uses call_rcu in the case it needs to signal an eventfd as a
result of an eventfd signal, since recursing eventfd signals are not
allowed. This should be calling the new call_rcu_hurry API to not delay
the signal.

Signed-off-by: Dylan Yudaken &lt;dylany@meta.com&gt;

Cc: Joel Fernandes (Google) &lt;joel@joelfernandes.org&gt;
Cc: Paul E. McKenney &lt;paulmck@kernel.org&gt;
Acked-by: Paul E. McKenney &lt;paulmck@kernel.org&gt;
Reviewed-by: Joel Fernandes (Google) &lt;joel@joelfernandes.org&gt;
Link: https://lore.kernel.org/r/20221215184138.795576-1-dylany@meta.com
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>io_uring: fix overflow handling regression</title>
<updated>2022-12-15T15:20:10+00:00</updated>
<author>
<name>Pavel Begunkov</name>
<email>asml.silence@gmail.com</email>
</author>
<published>2022-12-02T17:47:25+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=a8cf95f93610eb8282f8b6d0117ba78b74588d6b'/>
<id>a8cf95f93610eb8282f8b6d0117ba78b74588d6b</id>
<content type='text'>
Because the single task locking series got reordered ahead of the
timeout and completion lock changes, two hunks inadvertently ended up
using __io_fill_cqe_req() rather than io_fill_cqe_req(). This meant
that we dropped overflow handling in those two spots. Reinstate the
correct CQE filling helper.

Fixes: f66f73421f0a ("io_uring: skip spinlocking for -&gt;task_complete")
Signed-off-by: Pavel Begunkov &lt;asml.silence@gmail.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Because the single task locking series got reordered ahead of the
timeout and completion lock changes, two hunks inadvertently ended up
using __io_fill_cqe_req() rather than io_fill_cqe_req(). This meant
that we dropped overflow handling in those two spots. Reinstate the
correct CQE filling helper.

Fixes: f66f73421f0a ("io_uring: skip spinlocking for -&gt;task_complete")
Signed-off-by: Pavel Begunkov &lt;asml.silence@gmail.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
</feed>
