<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/io_uring/io_uring.c, branch v6.5-rc2</title>
<subtitle>Linux kernel source tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/'/>
<entry>
<title>io_uring: Use io_schedule* in cqring wait</title>
<updated>2023-07-07T17:24:29+00:00</updated>
<author>
<name>Andres Freund</name>
<email>andres@anarazel.de</email>
</author>
<published>2023-07-07T16:20:07+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=8a796565cec3601071cbbd27d6304e202019d014'/>
<id>8a796565cec3601071cbbd27d6304e202019d014</id>
<content type='text'>
I observed poor performance of io_uring compared to synchronous IO. That
turns out to be caused by deeper CPU idle states entered with io_uring,
due to io_uring using plain schedule(), whereas synchronous IO uses
io_schedule().

The losses due to this are substantial. On my cascade lake workstation,
t/io_uring from the fio repository e.g. yields regressions between 20%
and 40% with the following command:
./t/io_uring -r 5 -X0 -d 1 -s 1 -c 1 -p 0 -S$use_sync -R 0 /mnt/t2/fio/write.0.0

This is repeatable with different filesystems, using raw block devices
and using different block devices.

Use io_schedule_prepare() / io_schedule_finish() in
io_cqring_wait_schedule() to address the difference.

After that using io_uring is on par or surpassing synchronous IO (using
registered files etc makes it reliably win, but arguably is a less fair
comparison).

There are other calls to schedule() in io_uring/, but none immediately
jump out to be similarly situated, so I did not touch them. Similarly,
it's possible that mutex_lock_io() should be used, but it's not clear if
there are cases where that matters.

Cc: stable@vger.kernel.org # 5.10+
Cc: Pavel Begunkov &lt;asml.silence@gmail.com&gt;
Cc: io-uring@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Andres Freund &lt;andres@anarazel.de&gt;
Link: https://lore.kernel.org/r/20230707162007.194068-1-andres@anarazel.de
[axboe: minor style fixup]
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
I observed poor performance of io_uring compared to synchronous IO. That
turns out to be caused by deeper CPU idle states entered with io_uring,
due to io_uring using plain schedule(), whereas synchronous IO uses
io_schedule().

The losses due to this are substantial. On my cascade lake workstation,
t/io_uring from the fio repository e.g. yields regressions between 20%
and 40% with the following command:
./t/io_uring -r 5 -X0 -d 1 -s 1 -c 1 -p 0 -S$use_sync -R 0 /mnt/t2/fio/write.0.0

This is repeatable with different filesystems, using raw block devices
and using different block devices.

Use io_schedule_prepare() / io_schedule_finish() in
io_cqring_wait_schedule() to address the difference.

After that using io_uring is on par or surpassing synchronous IO (using
registered files etc makes it reliably win, but arguably is a less fair
comparison).

There are other calls to schedule() in io_uring/, but none immediately
jump out to be similarly situated, so I did not touch them. Similarly,
it's possible that mutex_lock_io() should be used, but it's not clear if
there are cases where that matters.

Cc: stable@vger.kernel.org # 5.10+
Cc: Pavel Begunkov &lt;asml.silence@gmail.com&gt;
Cc: io-uring@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Andres Freund &lt;andres@anarazel.de&gt;
Link: https://lore.kernel.org/r/20230707162007.194068-1-andres@anarazel.de
[axboe: minor style fixup]
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>io_uring: flush offloaded and delayed task_work on exit</title>
<updated>2023-06-28T17:06:05+00:00</updated>
<author>
<name>Jens Axboe</name>
<email>axboe@kernel.dk</email>
</author>
<published>2023-06-28T17:06:05+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=dfbe5561ae9339516a3742a3fbd678609ad59fd0'/>
<id>dfbe5561ae9339516a3742a3fbd678609ad59fd0</id>
<content type='text'>
io_uring offloads task_work for cancelation purposes when the task is
exiting. This is conceptually fine, but we should be nicer and actually
wait for that work to complete before returning.

Add an argument to io_fallback_tw() telling it to flush the deferred
work when it's all queued up, and have it flush a ctx behind whenever
the ctx changes.

Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
io_uring offloads task_work for cancelation purposes when the task is
exiting. This is conceptually fine, but we should be nicer and actually
wait for that work to complete before returning.

Add an argument to io_fallback_tw() telling it to flush the deferred
work when it's all queued up, and have it flush a ctx behind whenever
the ctx changes.

Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>io_uring: remove io_fallback_tw() forward declaration</title>
<updated>2023-06-27T22:07:24+00:00</updated>
<author>
<name>Jens Axboe</name>
<email>axboe@kernel.dk</email>
</author>
<published>2023-06-27T17:57:53+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=10e1c0d59006c6492d380602aa0a6c4eb9441426'/>
<id>10e1c0d59006c6492d380602aa0a6c4eb9441426</id>
<content type='text'>
It's used just one function higher up, get rid of the declaration and
just move it up a bit.

Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
It's used just one function higher up, get rid of the declaration and
just move it up a bit.

Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>io_uring: merge conditional unlock flush helpers</title>
<updated>2023-06-23T14:19:40+00:00</updated>
<author>
<name>Pavel Begunkov</name>
<email>asml.silence@gmail.com</email>
</author>
<published>2023-06-23T11:23:31+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=c98c81a4ac37b651be7eb9d16f562fc4acc5f867'/>
<id>c98c81a4ac37b651be7eb9d16f562fc4acc5f867</id>
<content type='text'>
There is no reason not to use __io_cq_unlock_post_flush for intermediate
aux CQE flushing, all -&gt;task_complete should apply there, i.e. if set it
should be the submitter task. Combine them, get rid of of
__io_cq_unlock_post() and rename the left function.

This place was also taking a couple percents of CPU according to
profiles for max throughput net benchmarks due to multishot recv
flooding it with completions.

Signed-off-by: Pavel Begunkov &lt;asml.silence@gmail.com&gt;
Link: https://lore.kernel.org/r/bbed60734cbec2e833d9c7bdcf9741aada5d8aab.1687518903.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
There is no reason not to use __io_cq_unlock_post_flush for intermediate
aux CQE flushing, all -&gt;task_complete should apply there, i.e. if set it
should be the submitter task. Combine them, get rid of of
__io_cq_unlock_post() and rename the left function.

This place was also taking a couple percents of CPU according to
profiles for max throughput net benchmarks due to multishot recv
flooding it with completions.

Signed-off-by: Pavel Begunkov &lt;asml.silence@gmail.com&gt;
Link: https://lore.kernel.org/r/bbed60734cbec2e833d9c7bdcf9741aada5d8aab.1687518903.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>io_uring: make io_cq_unlock_post static</title>
<updated>2023-06-23T14:19:40+00:00</updated>
<author>
<name>Pavel Begunkov</name>
<email>asml.silence@gmail.com</email>
</author>
<published>2023-06-23T11:23:30+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=0fdb9a196c6728b51e0e7a4f6fa292d9fd5793de'/>
<id>0fdb9a196c6728b51e0e7a4f6fa292d9fd5793de</id>
<content type='text'>
io_cq_unlock_post() is exclusively used in io_uring/io_uring.c, mark it
static and don't expose to other files.

Signed-off-by: Pavel Begunkov &lt;asml.silence@gmail.com&gt;
Link: https://lore.kernel.org/r/3dc8127dda4514e1dd24bb32035faac887c5fa37.1687518903.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
io_cq_unlock_post() is exclusively used in io_uring/io_uring.c, mark it
static and don't expose to other files.

Signed-off-by: Pavel Begunkov &lt;asml.silence@gmail.com&gt;
Link: https://lore.kernel.org/r/3dc8127dda4514e1dd24bb32035faac887c5fa37.1687518903.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>io_uring: inline __io_cq_unlock</title>
<updated>2023-06-23T14:19:40+00:00</updated>
<author>
<name>Pavel Begunkov</name>
<email>asml.silence@gmail.com</email>
</author>
<published>2023-06-23T11:23:29+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=ff12617728fa5c7fb5325e164503ca4e936b80bd'/>
<id>ff12617728fa5c7fb5325e164503ca4e936b80bd</id>
<content type='text'>
__io_cq_unlock is not very helpful, and users should be calling flush
variants anyway. Open code the function.

Signed-off-by: Pavel Begunkov &lt;asml.silence@gmail.com&gt;
Link: https://lore.kernel.org/r/d875c4cfb69f38ccecb58a57111446c77a614caa.1687518903.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
__io_cq_unlock is not very helpful, and users should be calling flush
variants anyway. Open code the function.

Signed-off-by: Pavel Begunkov &lt;asml.silence@gmail.com&gt;
Link: https://lore.kernel.org/r/d875c4cfb69f38ccecb58a57111446c77a614caa.1687518903.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>io_uring: fix acquire/release annotations</title>
<updated>2023-06-23T14:19:40+00:00</updated>
<author>
<name>Pavel Begunkov</name>
<email>asml.silence@gmail.com</email>
</author>
<published>2023-06-23T11:23:28+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=55b6a69fed5df6f88ef0b2ace562b422162beb61'/>
<id>55b6a69fed5df6f88ef0b2ace562b422162beb61</id>
<content type='text'>
We do conditional locking, so __io_cq_lock() and friends not always
actually grab/release the lock, so kill misleading annotations.

Signed-off-by: Pavel Begunkov &lt;asml.silence@gmail.com&gt;
Link: https://lore.kernel.org/r/2a098f9144c24cab622f8bf90b39f44da5d0401e.1687518903.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
We do conditional locking, so __io_cq_lock() and friends not always
actually grab/release the lock, so kill misleading annotations.

Signed-off-by: Pavel Begunkov &lt;asml.silence@gmail.com&gt;
Link: https://lore.kernel.org/r/2a098f9144c24cab622f8bf90b39f44da5d0401e.1687518903.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>io_uring: kill io_cq_unlock()</title>
<updated>2023-06-23T14:19:39+00:00</updated>
<author>
<name>Pavel Begunkov</name>
<email>asml.silence@gmail.com</email>
</author>
<published>2023-06-23T11:23:27+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=f432b76bcc93f36edb3d371f7b8d7881261dd6e7'/>
<id>f432b76bcc93f36edb3d371f7b8d7881261dd6e7</id>
<content type='text'>
We're abusing -&gt;completion_lock helpers. io_cq_unlock() neither
locking conditionally nor doing CQE flushing, which means that callers
must have some side reason of taking the lock and should do it directly.

Open code io_cq_unlock() into io_cqring_overflow_kill() and clean it up.

Signed-off-by: Pavel Begunkov &lt;asml.silence@gmail.com&gt;
Link: https://lore.kernel.org/r/7dabb36856db2b562e78780480396c52c29b2bf4.1687518903.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
We're abusing -&gt;completion_lock helpers. io_cq_unlock() neither
locking conditionally nor doing CQE flushing, which means that callers
must have some side reason of taking the lock and should do it directly.

Open code io_cq_unlock() into io_cqring_overflow_kill() and clean it up.

Signed-off-by: Pavel Begunkov &lt;asml.silence@gmail.com&gt;
Link: https://lore.kernel.org/r/7dabb36856db2b562e78780480396c52c29b2bf4.1687518903.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>io_uring: remove IOU_F_TWQ_FORCE_NORMAL</title>
<updated>2023-06-23T14:19:39+00:00</updated>
<author>
<name>Pavel Begunkov</name>
<email>asml.silence@gmail.com</email>
</author>
<published>2023-06-23T11:23:26+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=91c7884ac9a92ffbf78af7fc89603daf24f448a9'/>
<id>91c7884ac9a92ffbf78af7fc89603daf24f448a9</id>
<content type='text'>
Extract a function for non-local task_work_add, and use it directly from
io_move_task_work_from_local(). Now we don't use IOU_F_TWQ_FORCE_NORMAL
and it can be killed.

As a small positive side effect we don't grab task-&gt;io_uring in
io_req_normal_work_add anymore, which is not needed for
io_req_local_work_add().

Signed-off-by: Pavel Begunkov &lt;asml.silence@gmail.com&gt;
Link: https://lore.kernel.org/r/2e55571e8ff2927ae3cc12da606d204e2485525b.1687518903.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Extract a function for non-local task_work_add, and use it directly from
io_move_task_work_from_local(). Now we don't use IOU_F_TWQ_FORCE_NORMAL
and it can be killed.

As a small positive side effect we don't grab task-&gt;io_uring in
io_req_normal_work_add anymore, which is not needed for
io_req_local_work_add().

Signed-off-by: Pavel Begunkov &lt;asml.silence@gmail.com&gt;
Link: https://lore.kernel.org/r/2e55571e8ff2927ae3cc12da606d204e2485525b.1687518903.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>io_uring: don't batch task put on reqs free</title>
<updated>2023-06-23T14:19:39+00:00</updated>
<author>
<name>Pavel Begunkov</name>
<email>asml.silence@gmail.com</email>
</author>
<published>2023-06-23T11:23:25+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=2fdd6fb5ff958a0f6b403e3f3ffd645b60b2a2b2'/>
<id>2fdd6fb5ff958a0f6b403e3f3ffd645b60b2a2b2</id>
<content type='text'>
We're trying to batch io_put_task() in io_free_batch_list(), but
considering that the hot path is a simple inc, it's most cerainly and
probably faster to just do io_put_task() instead of task tracking.

We don't care about io_put_task_remote() as it's only for IOPOLL
where polling/waiting is done by not the submitter task.

Signed-off-by: Pavel Begunkov &lt;asml.silence@gmail.com&gt;
Link: https://lore.kernel.org/r/4a7ef7dce845fe2bd35507bf389d6bd2d5c1edf0.1687518903.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
We're trying to batch io_put_task() in io_free_batch_list(), but
considering that the hot path is a simple inc, it's most cerainly and
probably faster to just do io_put_task() instead of task tracking.

We don't care about io_put_task_remote() as it's only for IOPOLL
where polling/waiting is done by not the submitter task.

Signed-off-by: Pavel Begunkov &lt;asml.silence@gmail.com&gt;
Link: https://lore.kernel.org/r/4a7ef7dce845fe2bd35507bf389d6bd2d5c1edf0.1687518903.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
</feed>
