<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/io_uring/Makefile, branch v6.15</title>
<subtitle>Linux kernel source tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/'/>
<entry>
<title>io_uring/epoll: remove CONFIG_EPOLL guards</title>
<updated>2025-02-20T14:59:56+00:00</updated>
<author>
<name>Jens Axboe</name>
<email>axboe@kernel.dk</email>
</author>
<published>2025-01-31T21:19:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=0fb3f5600c5e583a1cf9dbe803ba7ef89f7fbc65'/>
<id>0fb3f5600c5e583a1cf9dbe803ba7ef89f7fbc65</id>
<content type='text'>
Just have the Makefile add the object if epoll is enabled, then it's
not necessary to guard the entire epoll.c file inside an CONFIG_EPOLL
ifdef.

Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Just have the Makefile add the object if epoll is enabled, then it's
not necessary to guard the entire epoll.c file inside an CONFIG_EPOLL
ifdef.

Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>io_uring/zcrx: add interface queue and refill queue</title>
<updated>2025-02-17T12:41:03+00:00</updated>
<author>
<name>David Wei</name>
<email>dw@davidwei.uk</email>
</author>
<published>2025-02-15T00:09:36+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=6f377873cb23905009759b7366b9fe85c2a6ff37'/>
<id>6f377873cb23905009759b7366b9fe85c2a6ff37</id>
<content type='text'>
Add a new object called an interface queue (ifq) that represents a net
rx queue that has been configured for zero copy. Each ifq is registered
using a new registration opcode IORING_REGISTER_ZCRX_IFQ.

The refill queue is allocated by the kernel and mapped by userspace
using a new offset IORING_OFF_RQ_RING, in a similar fashion to the main
SQ/CQ. It is used by userspace to return buffers that it is done with,
which will then be re-used by the netdev again.

The main CQ ring is used to notify userspace of received data by using
the upper 16 bytes of a big CQE as a new struct io_uring_zcrx_cqe. Each
entry contains the offset + len to the data.

For now, each io_uring instance only has a single ifq.

Reviewed-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: David Wei &lt;dw@davidwei.uk&gt;
Acked-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
Link: https://lore.kernel.org/r/20250215000947.789731-2-dw@davidwei.uk
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Add a new object called an interface queue (ifq) that represents a net
rx queue that has been configured for zero copy. Each ifq is registered
using a new registration opcode IORING_REGISTER_ZCRX_IFQ.

The refill queue is allocated by the kernel and mapped by userspace
using a new offset IORING_OFF_RQ_RING, in a similar fashion to the main
SQ/CQ. It is used by userspace to return buffers that it is done with,
which will then be re-used by the netdev again.

The main CQ ring is used to notify userspace of received data by using
the upper 16 bytes of a big CQE as a new struct io_uring_zcrx_cqe. Each
entry contains the offset + len to the data.

For now, each io_uring instance only has a single ifq.

Reviewed-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: David Wei &lt;dw@davidwei.uk&gt;
Acked-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
Link: https://lore.kernel.org/r/20250215000947.789731-2-dw@davidwei.uk
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>io_uring: add alloc_cache.c</title>
<updated>2025-01-28T22:10:40+00:00</updated>
<author>
<name>Pavel Begunkov</name>
<email>asml.silence@gmail.com</email>
</author>
<published>2025-01-28T20:56:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=d19af0e9366298aa60afc0fb51ffcbd6205edcee'/>
<id>d19af0e9366298aa60afc0fb51ffcbd6205edcee</id>
<content type='text'>
Avoid inlining all and everything from alloc_cache.h and move cold bits
into a new file.

Signed-off-by: Pavel Begunkov &lt;asml.silence@gmail.com&gt;
Reviewed-by: Gabriel Krisman Bertazi &lt;krisman@suse.de&gt;
Link: https://lore.kernel.org/r/06984c6cd58e703f7cfae5ab3067912f9f635a06.1738087204.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Avoid inlining all and everything from alloc_cache.h and move cold bits
into a new file.

Signed-off-by: Pavel Begunkov &lt;asml.silence@gmail.com&gt;
Reviewed-by: Gabriel Krisman Bertazi &lt;krisman@suse.de&gt;
Link: https://lore.kernel.org/r/06984c6cd58e703f7cfae5ab3067912f9f635a06.1738087204.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>io_uring: add GCOV_PROFILE_URING Kconfig option</title>
<updated>2024-08-30T16:52:02+00:00</updated>
<author>
<name>Jens Axboe</name>
<email>axboe@kernel.dk</email>
</author>
<published>2024-08-30T16:52:02+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=1802656ef8906cc949f58b64cb6d8d400326e163'/>
<id>1802656ef8906cc949f58b64cb6d8d400326e163</id>
<content type='text'>
If GCOV is enabled and this option is set, it enables code coverage
profiling of the io_uring subsystem. Only use this for test purposes,
as it will impact the runtime performance.

Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
If GCOV is enabled and this option is set, it enables code coverage
profiling of the io_uring subsystem. Only use this for test purposes,
as it will impact the runtime performance.

Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>io_uring/eventfd: move eventfd handling to separate file</title>
<updated>2024-06-16T20:54:55+00:00</updated>
<author>
<name>Jens Axboe</name>
<email>axboe@kernel.dk</email>
</author>
<published>2024-06-03T17:51:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=200f3abd14db55f9aadcb74f4e7a678f1c469ba1'/>
<id>200f3abd14db55f9aadcb74f4e7a678f1c469ba1</id>
<content type='text'>
This is pretty nicely abstracted already, but let's move it to a separate
file rather than have it in the main io_uring file. With that, we can
also move the io_ev_fd struct and enum out of global scope.

Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This is pretty nicely abstracted already, but let's move it to a separate
file rather than have it in the main io_uring file. With that, we can
also move the io_ev_fd struct and enum out of global scope.

Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>io_uring: move mapping/allocation helpers to a separate file</title>
<updated>2024-04-15T14:10:26+00:00</updated>
<author>
<name>Jens Axboe</name>
<email>axboe@kernel.dk</email>
</author>
<published>2024-03-27T20:59:09+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=f15ed8b4d0ce2c0831232ff85117418740f0c529'/>
<id>f15ed8b4d0ce2c0831232ff85117418740f0c529</id>
<content type='text'>
Move the related code from io_uring.c into memmap.c. No functional
changes in this patch, just cleaning it up a bit now that the full
transition is done.

Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Move the related code from io_uring.c into memmap.c. No functional
changes in this patch, just cleaning it up a bit now that the full
transition is done.

Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>io_uring: re-arrange Makefile order</title>
<updated>2024-04-15T14:10:25+00:00</updated>
<author>
<name>Jens Axboe</name>
<email>axboe@kernel.dk</email>
</author>
<published>2024-03-27T04:13:41+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=77a1cd5e795726235469c31cf13a25116bf2d712'/>
<id>77a1cd5e795726235469c31cf13a25116bf2d712</id>
<content type='text'>
The object list is a bit of a mess, with core and opcode files mixed in.
Re-arrange it so that we have the core bits first, and then opcode
specific files after that.

Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The object list is a bit of a mess, with core and opcode files mixed in.
Re-arrange it so that we have the core bits first, and then opcode
specific files after that.

Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>io-uring: add napi busy poll support</title>
<updated>2024-02-09T18:54:19+00:00</updated>
<author>
<name>Stefan Roesch</name>
<email>shr@devkernel.io</email>
</author>
<published>2023-06-08T16:38:36+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=8d0c12a80cdeb80d5e0510e96d38fe551ed8e9b5'/>
<id>8d0c12a80cdeb80d5e0510e96d38fe551ed8e9b5</id>
<content type='text'>
This adds the napi busy polling support in io_uring.c. It adds a new
napi_list to the io_ring_ctx structure. This list contains the list of
napi_id's that are currently enabled for busy polling. The list is
synchronized by the new napi_lock spin lock. The current default napi
busy polling time is stored in napi_busy_poll_to. If napi busy polling
is not enabled, the value is 0.

In addition there is also a hash table. The hash table store the napi
id and the pointer to the above list nodes. The hash table is used to
speed up the lookup to the list elements. The hash table is synchronized
with rcu.

The NAPI_TIMEOUT is stored as a timeout to make sure that the time a
napi entry is stored in the napi list is limited.

The busy poll timeout is also stored as part of the io_wait_queue. This
is necessary as for sq polling the poll interval needs to be adjusted
and the napi callback allows only to pass in one value.

This has been tested with two simple programs from the liburing library
repository: the napi client and the napi server program. The client
sends a request, which has a timestamp in its payload and the server
replies with the same payload. The client calculates the roundtrip time
and stores it to calculate the results.

The client is running on host1 and the server is running on host 2 (in
the same rack). The measured times below are roundtrip times. They are
average times over 5 runs each. Each run measures 1 million roundtrips.

                   no rx coal          rx coal: frames=88,usecs=33
Default              57us                    56us

client_poll=100us    47us                    46us

server_poll=100us    51us                    46us

client_poll=100us+   40us                    40us
server_poll=100us

client_poll=100us+   41us                    39us
server_poll=100us+
prefer napi busy poll on client

client_poll=100us+   41us                    39us
server_poll=100us+
prefer napi busy poll on server

client_poll=100us+   41us                    39us
server_poll=100us+
prefer napi busy poll on client + server

Signed-off-by: Stefan Roesch &lt;shr@devkernel.io&gt;
Suggested-by: Olivier Langlois &lt;olivier@trillion01.com&gt;
Acked-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
Link: https://lore.kernel.org/r/20230608163839.2891748-5-shr@devkernel.io
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This adds the napi busy polling support in io_uring.c. It adds a new
napi_list to the io_ring_ctx structure. This list contains the list of
napi_id's that are currently enabled for busy polling. The list is
synchronized by the new napi_lock spin lock. The current default napi
busy polling time is stored in napi_busy_poll_to. If napi busy polling
is not enabled, the value is 0.

In addition there is also a hash table. The hash table store the napi
id and the pointer to the above list nodes. The hash table is used to
speed up the lookup to the list elements. The hash table is synchronized
with rcu.

The NAPI_TIMEOUT is stored as a timeout to make sure that the time a
napi entry is stored in the napi list is limited.

The busy poll timeout is also stored as part of the io_wait_queue. This
is necessary as for sq polling the poll interval needs to be adjusted
and the napi callback allows only to pass in one value.

This has been tested with two simple programs from the liburing library
repository: the napi client and the napi server program. The client
sends a request, which has a timestamp in its payload and the server
replies with the same payload. The client calculates the roundtrip time
and stores it to calculate the results.

The client is running on host1 and the server is running on host 2 (in
the same rack). The measured times below are roundtrip times. They are
average times over 5 runs each. Each run measures 1 million roundtrips.

                   no rx coal          rx coal: frames=88,usecs=33
Default              57us                    56us

client_poll=100us    47us                    46us

server_poll=100us    51us                    46us

client_poll=100us+   40us                    40us
server_poll=100us

client_poll=100us+   41us                    39us
server_poll=100us+
prefer napi busy poll on client

client_poll=100us+   41us                    39us
server_poll=100us+
prefer napi busy poll on server

client_poll=100us+   41us                    39us
server_poll=100us+
prefer napi busy poll on client + server

Signed-off-by: Stefan Roesch &lt;shr@devkernel.io&gt;
Suggested-by: Olivier Langlois &lt;olivier@trillion01.com&gt;
Acked-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
Link: https://lore.kernel.org/r/20230608163839.2891748-5-shr@devkernel.io
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>io_uring: add support for ftruncate</title>
<updated>2024-02-09T16:04:39+00:00</updated>
<author>
<name>Tony Solomonik</name>
<email>tony.solomonik@gmail.com</email>
</author>
<published>2024-02-02T12:17:24+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=b4bb1900c12e6a0fe11ff51e1aa6eea19a4ad635'/>
<id>b4bb1900c12e6a0fe11ff51e1aa6eea19a4ad635</id>
<content type='text'>
Adds support for doing truncate through io_uring, eliminating
the need for applications to roll their own thread pool or offload
mechanism to be able to do non-blocking truncates.

Signed-off-by: Tony Solomonik &lt;tony.solomonik@gmail.com&gt;
Link: https://lore.kernel.org/r/20240202121724.17461-3-tony.solomonik@gmail.com
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Adds support for doing truncate through io_uring, eliminating
the need for applications to roll their own thread pool or offload
mechanism to be able to do non-blocking truncates.

Signed-off-by: Tony Solomonik &lt;tony.solomonik@gmail.com&gt;
Link: https://lore.kernel.org/r/20240202121724.17461-3-tony.solomonik@gmail.com
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>io_uring/register: move io_uring_register(2) related code to register.c</title>
<updated>2023-12-19T15:54:20+00:00</updated>
<author>
<name>Jens Axboe</name>
<email>axboe@kernel.dk</email>
</author>
<published>2023-12-19T15:54:20+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=c43203154d8ac579537aa0c7802b77d463b1f53a'/>
<id>c43203154d8ac579537aa0c7802b77d463b1f53a</id>
<content type='text'>
Most of this code is basically self contained, move it out of the core
io_uring file to bring a bit more separation to the registration related
bits. This moves another ~10% of the code into register.c.

Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Most of this code is basically self contained, move it out of the core
io_uring file to bring a bit more separation to the registration related
bits. This moves another ~10% of the code into register.c.

Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
</feed>
