<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/include/linux/eventpoll.h, branch v7.2-rc1</title>
<subtitle>Linux kernel source tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/'/>
<entry>
<title>Merge patch series "io_uring related epoll cleanups"</title>
<updated>2026-05-15T15:41:05+00:00</updated>
<author>
<name>Christian Brauner</name>
<email>brauner@kernel.org</email>
</author>
<published>2026-05-15T15:41:05+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=6ece1a31c58c8c8293ecbbe79d7f92d52e1b0022'/>
<id>6ece1a31c58c8c8293ecbbe79d7f92d52e1b0022</id>
<content type='text'>
Jens Axboe &lt;axboe@kernel.dk&gt; says:

One of the nastier things about epoll is how it allows nesting contexts
inside each other, leading to the necessity of loop detection and the
issues that have come with that.

I don't believe there's any reason to support nesting on the io_uring
side, in fact IORING_OP_EPOLL_CTL is a historical mistake, imho. But
let's at least try and contain the damage and disallow nested contexts
from our side.

Christian Brauner &lt;brauner@kernel.org&gt; says:

Bring in the eventpoll specific io_uring changes together with the
eventpoll cleanup I did this cycle. The io_uring changes can go on top
of both through the block tree.

* patches from https://patch.msgid.link/20260514140817.623026-1-axboe@kernel.dk:
  eventpoll: rename struct epoll_filefd to epoll_key
  eventpoll: add file based control interface
  eventpoll: export is_file_epoll()
  eventpoll: pass struct epoll_filefd through ep_find() and ep_insert()

Link: https://patch.msgid.link/20260514140817.623026-1-axboe@kernel.dk
Signed-off-by: Christian Brauner (Amutable) &lt;brauner@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Jens Axboe &lt;axboe@kernel.dk&gt; says:

One of the nastier things about epoll is how it allows nesting contexts
inside each other, leading to the necessity of loop detection and the
issues that have come with that.

I don't believe there's any reason to support nesting on the io_uring
side, in fact IORING_OP_EPOLL_CTL is a historical mistake, imho. But
let's at least try and contain the damage and disallow nested contexts
from our side.

Christian Brauner &lt;brauner@kernel.org&gt; says:

Bring in the eventpoll specific io_uring changes together with the
eventpoll cleanup I did this cycle. The io_uring changes can go on top
of both through the block tree.

* patches from https://patch.msgid.link/20260514140817.623026-1-axboe@kernel.dk:
  eventpoll: rename struct epoll_filefd to epoll_key
  eventpoll: add file based control interface
  eventpoll: export is_file_epoll()
  eventpoll: pass struct epoll_filefd through ep_find() and ep_insert()

Link: https://patch.msgid.link/20260514140817.623026-1-axboe@kernel.dk
Signed-off-by: Christian Brauner (Amutable) &lt;brauner@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>eventpoll: rename struct epoll_filefd to epoll_key</title>
<updated>2026-05-15T15:34:26+00:00</updated>
<author>
<name>Jens Axboe</name>
<email>axboe@kernel.dk</email>
</author>
<published>2026-05-14T14:07:20+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=463aba8090738bcd956297f7341b6874e76ae664'/>
<id>463aba8090738bcd956297f7341b6874e76ae664</id>
<content type='text'>
This more accurately describes what purpose this structure serves, as
a lookup key.

Suggested-by: Christian Brauner &lt;brauner@kernel.org&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Link: https://patch.msgid.link/20260514140817.623026-5-axboe@kernel.dk
Signed-off-by: Christian Brauner &lt;brauner@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This more accurately describes what purpose this structure serves, as
a lookup key.

Suggested-by: Christian Brauner &lt;brauner@kernel.org&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Link: https://patch.msgid.link/20260514140817.623026-5-axboe@kernel.dk
Signed-off-by: Christian Brauner &lt;brauner@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>eventpoll: add file based control interface</title>
<updated>2026-05-15T15:34:26+00:00</updated>
<author>
<name>Jens Axboe</name>
<email>axboe@kernel.dk</email>
</author>
<published>2026-05-14T14:07:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=0995baf261ce8fc141768911d97ba76e854e26cf'/>
<id>0995baf261ce8fc141768911d97ba76e854e26cf</id>
<content type='text'>
Add do_epoll_ctl_file(), which takes a pre-resolved epoll file and a
struct epoll_filefd for the target rather than two integer file
descriptors. do_epoll_ctl() remains as a thin wrapper.

In preparation for using the file based interface from io_uring.

Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Link: https://patch.msgid.link/20260514140817.623026-4-axboe@kernel.dk
Signed-off-by: Christian Brauner &lt;brauner@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Add do_epoll_ctl_file(), which takes a pre-resolved epoll file and a
struct epoll_filefd for the target rather than two integer file
descriptors. do_epoll_ctl() remains as a thin wrapper.

In preparation for using the file based interface from io_uring.

Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Link: https://patch.msgid.link/20260514140817.623026-4-axboe@kernel.dk
Signed-off-by: Christian Brauner &lt;brauner@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>eventpoll: export is_file_epoll()</title>
<updated>2026-05-15T15:34:26+00:00</updated>
<author>
<name>Jens Axboe</name>
<email>axboe@kernel.dk</email>
</author>
<published>2026-05-14T14:07:18+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=5de2759f2b7c925f187e552cae47775acd5f4b40'/>
<id>5de2759f2b7c925f187e552cae47775acd5f4b40</id>
<content type='text'>
Make is_file_epoll() available outside of epoll. This is in preparation
for using it from io_uring.

Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Link: https://patch.msgid.link/20260514140817.623026-3-axboe@kernel.dk
Reviewed-by: Christian Brauner &lt;brauner@kernel.org&gt;
Signed-off-by: Christian Brauner &lt;brauner@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Make is_file_epoll() available outside of epoll. This is in preparation
for using it from io_uring.

Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Link: https://patch.msgid.link/20260514140817.623026-3-axboe@kernel.dk
Reviewed-by: Christian Brauner &lt;brauner@kernel.org&gt;
Signed-off-by: Christian Brauner &lt;brauner@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>eventpoll: refresh eventpoll_release() fast-path comment</title>
<updated>2026-04-23T22:36:50+00:00</updated>
<author>
<name>Christian Brauner</name>
<email>brauner@kernel.org</email>
</author>
<published>2026-04-23T09:56:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=33e92e9ecf48c08cb4807e9a36f9eb01619c1a1e'/>
<id>33e92e9ecf48c08cb4807e9a36f9eb01619c1a1e</id>
<content type='text'>
The old comment justified the lockless READ_ONCE(file-&gt;f_ep) check
with "False positives simply cannot happen because the file is on
the way to be removed and nobody ( but eventpoll ) has still a
reference to this file." That reasoning was the root of the UAF
fixed in "eventpoll: fix ep_remove struct eventpoll / struct file
UAF": __ep_remove() could clear f_ep while another close raced
past the fast path and freed the watched eventpoll / recycled the
struct file slot.

With ep_remove() now pinning @file via epi_fget() across the f_ep
clear and hlist_del_rcu(), the invariant is re-established for the
right reason: anyone who might clear f_ep holds @file alive for
the duration, so a NULL observation really does mean no
concurrent eventpoll path has work left on this file. Refresh the
comment accordingly so the next reader doesn't inherit the broken
model.

Link: https://patch.msgid.link/20260423-work-epoll-uaf-v1-8-2470f9eec0f5@kernel.org
Signed-off-by: Christian Brauner (Amutable) &lt;brauner@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The old comment justified the lockless READ_ONCE(file-&gt;f_ep) check
with "False positives simply cannot happen because the file is on
the way to be removed and nobody ( but eventpoll ) has still a
reference to this file." That reasoning was the root of the UAF
fixed in "eventpoll: fix ep_remove struct eventpoll / struct file
UAF": __ep_remove() could clear f_ep while another close raced
past the fast path and freed the watched eventpoll / recycled the
struct file slot.

With ep_remove() now pinning @file via epi_fget() across the f_ep
clear and hlist_del_rcu(), the invariant is re-established for the
right reason: anyone who might clear f_ep holds @file alive for
the duration, so a NULL observation really does mean no
concurrent eventpoll path has work left on this file. Refresh the
comment accordingly so the next reader doesn't inherit the broken
model.

Link: https://patch.msgid.link/20260423-work-epoll-uaf-v1-8-2470f9eec0f5@kernel.org
Signed-off-by: Christian Brauner (Amutable) &lt;brauner@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>eventpoll: Convert epoll_put_uevent() to scoped user access</title>
<updated>2026-03-07T23:03:14+00:00</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2026-03-07T20:07:15+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=1954c4f012206147c34acda8da04f827aa7d3ee3'/>
<id>1954c4f012206147c34acda8da04f827aa7d3ee3</id>
<content type='text'>
Saves two function calls, and one stac/clac pair.

stac/clac is rather expensive on older cpus like Zen 2.

A synthetic network stress test gives a ~1.5% increase of pps
on AMD Zen 2.

Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Cc: Christophe Leroy &lt;christophe.leroy@csgroup.eu&gt;
Cc: Dave Hansen &lt;dave.hansen@intel.com&gt;
Cc: Kuniyuki Iwashima &lt;kuniyu@google.com&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Saves two function calls, and one stac/clac pair.

stac/clac is rather expensive on older cpus like Zen 2.

A synthetic network stress test gives a ~1.5% increase of pps
on AMD Zen 2.

Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Cc: Christophe Leroy &lt;christophe.leroy@csgroup.eu&gt;
Cc: Dave Hansen &lt;dave.hansen@intel.com&gt;
Cc: Kuniyuki Iwashima &lt;kuniyu@google.com&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>eventpoll: add epoll_sendevents() helper</title>
<updated>2025-02-20T09:18:37+00:00</updated>
<author>
<name>Jens Axboe</name>
<email>axboe@kernel.dk</email>
</author>
<published>2025-02-19T17:22:26+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=ae3a4f1fdc2cfad089e79e2ee4697f84941528d3'/>
<id>ae3a4f1fdc2cfad089e79e2ee4697f84941528d3</id>
<content type='text'>
Basic helper that copies ready events to the specified userspace
address. The event checking is quick and racy; it's up to the caller
to ensure it retries appropriately in case 0 events are copied.

Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Link: https://lore.kernel.org/r/20250219172552.1565603-4-axboe@kernel.dk
Signed-off-by: Christian Brauner &lt;brauner@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Basic helper that copies ready events to the specified userspace
address. The event checking is quick and racy; it's up to the caller
to ensure it retries appropriately in case 0 events are copied.

Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Link: https://lore.kernel.org/r/20250219172552.1565603-4-axboe@kernel.dk
Signed-off-by: Christian Brauner &lt;brauner@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>epoll: annotate racy check</title>
<updated>2024-10-22T09:16:56+00:00</updated>
<author>
<name>Christian Brauner</name>
<email>brauner@kernel.org</email>
</author>
<published>2024-09-25T09:05:16+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=6474353a5e3d0b2cf610153cea0c61f576a36d0a'/>
<id>6474353a5e3d0b2cf610153cea0c61f576a36d0a</id>
<content type='text'>
Epoll relies on a racy fastpath check during __fput() in
eventpoll_release() to avoid the hit of pointlessly acquiring a
semaphore. Annotate that race by using WRITE_ONCE() and READ_ONCE().

Link: https://lore.kernel.org/r/66edfb3c.050a0220.3195df.001a.GAE@google.com
Link: https://lore.kernel.org/r/20240925-fungieren-anbauen-79b334b00542@brauner
Reviewed-by: Jan Kara &lt;jack@suse.cz&gt;
Reported-by: syzbot+3b6b32dc50537a49bb4a@syzkaller.appspotmail.com
Signed-off-by: Christian Brauner &lt;brauner@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Epoll relies on a racy fastpath check during __fput() in
eventpoll_release() to avoid the hit of pointlessly acquiring a
semaphore. Annotate that race by using WRITE_ONCE() and READ_ONCE().

Link: https://lore.kernel.org/r/66edfb3c.050a0220.3195df.001a.GAE@google.com
Link: https://lore.kernel.org/r/20240925-fungieren-anbauen-79b334b00542@brauner
Reviewed-by: Jan Kara &lt;jack@suse.cz&gt;
Reported-by: syzbot+3b6b32dc50537a49bb4a@syzkaller.appspotmail.com
Signed-off-by: Christian Brauner &lt;brauner@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ARM: 9108/1: oabi-compat: rework epoll_wait/epoll_pwait emulation</title>
<updated>2021-08-20T10:39:26+00:00</updated>
<author>
<name>Arnd Bergmann</name>
<email>arnd@arndb.de</email>
</author>
<published>2021-08-11T07:30:22+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=249dbe74d3c4b568a623fb55c56cddf19fdf0b89'/>
<id>249dbe74d3c4b568a623fb55c56cddf19fdf0b89</id>
<content type='text'>
The epoll_wait() system call wrapper is one of the remaining users of
the set_fs() infrastructure for Arm. Changing it to not require set_fs()
is rather complex, unfortunately.

The approach I'm taking here is to allow architectures to override
the code that copies the output to user space, and let the oabi-compat
implementation check whether it is getting called from an EABI or OABI
system call based on the thread_info-&gt;syscall value.

The in_oabi_syscall() check here mirrors the in_compat_syscall() and
in_x32_syscall() helpers for 32-bit compat implementations on other
architectures.

Overall, the amount of code goes down, at least with the newly added
sys_oabi_epoll_pwait() helper getting removed again. The downside
is added complexity in the source code for the native implementation.
There should be no difference in runtime performance except for Arm
kernels with CONFIG_OABI_COMPAT enabled that now have to go through
an external function call to check which of the two variants to use.

Acked-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Arnd Bergmann &lt;arnd@arndb.de&gt;
Signed-off-by: Russell King (Oracle) &lt;rmk+kernel@armlinux.org.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The epoll_wait() system call wrapper is one of the remaining users of
the set_fs() infrastructure for Arm. Changing it to not require set_fs()
is rather complex, unfortunately.

The approach I'm taking here is to allow architectures to override
the code that copies the output to user space, and let the oabi-compat
implementation check whether it is getting called from an EABI or OABI
system call based on the thread_info-&gt;syscall value.

The in_oabi_syscall() check here mirrors the in_compat_syscall() and
in_x32_syscall() helpers for 32-bit compat implementations on other
architectures.

Overall, the amount of code goes down, at least with the newly added
sys_oabi_epoll_pwait() helper getting removed again. The downside
is added complexity in the source code for the native implementation.
There should be no difference in runtime performance except for Arm
kernels with CONFIG_OABI_COMPAT enabled that now have to go through
an external function call to check which of the two variants to use.

Acked-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Arnd Bergmann &lt;arnd@arndb.de&gt;
Signed-off-by: Russell King (Oracle) &lt;rmk+kernel@armlinux.org.uk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>kcmp: Support selection of SYS_kcmp without CHECKPOINT_RESTORE</title>
<updated>2021-02-16T08:59:41+00:00</updated>
<author>
<name>Chris Wilson</name>
<email>chris@chris-wilson.co.uk</email>
</author>
<published>2021-02-05T22:00:12+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=bfe3911a91047557eb0e620f95a370aee6a248c7'/>
<id>bfe3911a91047557eb0e620f95a370aee6a248c7</id>
<content type='text'>
Userspace has discovered the functionality offered by SYS_kcmp and has
started to depend upon it. In particular, Mesa uses SYS_kcmp for
os_same_file_description() in order to identify when two fd (e.g. device
or dmabuf) point to the same struct file. Since they depend on it for
core functionality, lift SYS_kcmp out of the non-default
CONFIG_CHECKPOINT_RESTORE into the selectable syscall category.

Rasmus Villemoes also pointed out that systemd uses SYS_kcmp to
deduplicate the per-service file descriptor store.

Note that some distributions such as Ubuntu are already enabling
CHECKPOINT_RESTORE in their configs and so, by extension, SYS_kcmp.

References: https://gitlab.freedesktop.org/drm/intel/-/issues/3046
Signed-off-by: Chris Wilson &lt;chris@chris-wilson.co.uk&gt;
Cc: Kees Cook &lt;keescook@chromium.org&gt;
Cc: Andy Lutomirski &lt;luto@amacapital.net&gt;
Cc: Will Drewry &lt;wad@chromium.org&gt;
Cc: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: Dave Airlie &lt;airlied@gmail.com&gt;
Cc: Daniel Vetter &lt;daniel@ffwll.ch&gt;
Cc: Lucas Stach &lt;l.stach@pengutronix.de&gt;
Cc: Rasmus Villemoes &lt;linux@rasmusvillemoes.dk&gt;
Cc: Cyrill Gorcunov &lt;gorcunov@gmail.com&gt;
Cc: stable@vger.kernel.org
Acked-by: Daniel Vetter &lt;daniel.vetter@ffwll.ch&gt; # DRM depends on kcmp
Acked-by: Rasmus Villemoes &lt;linux@rasmusvillemoes.dk&gt; # systemd uses kcmp
Reviewed-by: Cyrill Gorcunov &lt;gorcunov@gmail.com&gt;
Reviewed-by: Kees Cook &lt;keescook@chromium.org&gt;
Acked-by: Thomas Zimmermann &lt;tzimmermann@suse.de&gt;
Signed-off-by: Daniel Vetter &lt;daniel.vetter@ffwll.ch&gt;
Link: https://patchwork.freedesktop.org/patch/msgid/20210205220012.1983-1-chris@chris-wilson.co.uk
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Userspace has discovered the functionality offered by SYS_kcmp and has
started to depend upon it. In particular, Mesa uses SYS_kcmp for
os_same_file_description() in order to identify when two fd (e.g. device
or dmabuf) point to the same struct file. Since they depend on it for
core functionality, lift SYS_kcmp out of the non-default
CONFIG_CHECKPOINT_RESTORE into the selectable syscall category.

Rasmus Villemoes also pointed out that systemd uses SYS_kcmp to
deduplicate the per-service file descriptor store.

Note that some distributions such as Ubuntu are already enabling
CHECKPOINT_RESTORE in their configs and so, by extension, SYS_kcmp.

References: https://gitlab.freedesktop.org/drm/intel/-/issues/3046
Signed-off-by: Chris Wilson &lt;chris@chris-wilson.co.uk&gt;
Cc: Kees Cook &lt;keescook@chromium.org&gt;
Cc: Andy Lutomirski &lt;luto@amacapital.net&gt;
Cc: Will Drewry &lt;wad@chromium.org&gt;
Cc: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: Dave Airlie &lt;airlied@gmail.com&gt;
Cc: Daniel Vetter &lt;daniel@ffwll.ch&gt;
Cc: Lucas Stach &lt;l.stach@pengutronix.de&gt;
Cc: Rasmus Villemoes &lt;linux@rasmusvillemoes.dk&gt;
Cc: Cyrill Gorcunov &lt;gorcunov@gmail.com&gt;
Cc: stable@vger.kernel.org
Acked-by: Daniel Vetter &lt;daniel.vetter@ffwll.ch&gt; # DRM depends on kcmp
Acked-by: Rasmus Villemoes &lt;linux@rasmusvillemoes.dk&gt; # systemd uses kcmp
Reviewed-by: Cyrill Gorcunov &lt;gorcunov@gmail.com&gt;
Reviewed-by: Kees Cook &lt;keescook@chromium.org&gt;
Acked-by: Thomas Zimmermann &lt;tzimmermann@suse.de&gt;
Signed-off-by: Daniel Vetter &lt;daniel.vetter@ffwll.ch&gt;
Link: https://patchwork.freedesktop.org/patch/msgid/20210205220012.1983-1-chris@chris-wilson.co.uk
</pre>
</div>
</content>
</entry>
</feed>
