<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-stable.git/fs/eventfd.c, branch linux-5.4.y</title>
<subtitle>Linux kernel stable tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/'/>
<entry>
<title>eventfd: prevent underflow for eventfd semaphores</title>
<updated>2023-09-23T08:59:40+00:00</updated>
<author>
<name>Wen Yang</name>
<email>wenyang.linux@foxmail.com</email>
</author>
<published>2023-07-09T06:54:51+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=0a2b1eb8a9ce53e3f16064c63f19e8612707109f'/>
<id>0a2b1eb8a9ce53e3f16064c63f19e8612707109f</id>
<content type='text'>
[ Upstream commit 758b492047816a3158d027e9fca660bc5bcf20bf ]

For eventfd with flag EFD_SEMAPHORE, when its ctx-&gt;count is 0, calling
eventfd_ctx_do_read will cause ctx-&gt;count to overflow to ULLONG_MAX.

An underflow can happen with EFD_SEMAPHORE eventfds in at least the
following three subsystems:

(1) virt/kvm/eventfd.c
(2) drivers/vfio/virqfd.c
(3) drivers/virt/acrn/irqfd.c

where (2) and (3) are just modeled after (1). An eventfd must be
specified for use with the KVM_IRQFD ioctl(). This can also be an
EFD_SEMAPHORE eventfd. When the eventfd count is zero or has been
decremented to zero an underflow can be triggered when the irqfd is shut
down by raising the KVM_IRQFD_FLAG_DEASSIGN flag in the KVM_IRQFD
ioctl():

        // ctx-&gt;count == 0
        kvm_vm_ioctl()
        -&gt; kvm_irqfd()
           -&gt; kvm_irqfd_deassign()
              -&gt; irqfd_deactivate()
                 -&gt; irqfd_shutdown()
                    -&gt; eventfd_ctx_remove_wait_queue(&amp;cnt)
                       -&gt; eventfd_ctx_do_read(&amp;cnt)

Userspace polling on the eventfd wouldn't notice the underflow because 1
is always returned as the value from eventfd_read() while ctx-&gt;count
would've underflowed. It's not a huge deal because this should only be
happening when the irqfd is shutdown but we should still fix it and
avoid the spurious wakeup.

Fixes: cb289d6244a3 ("eventfd - allow atomic read and waitqueue remove")
Signed-off-by: Wen Yang &lt;wenyang.linux@foxmail.com&gt;
Cc: Alexander Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: Jens Axboe &lt;axboe@kernel.dk&gt;
Cc: Christian Brauner &lt;brauner@kernel.org&gt;
Cc: Christoph Hellwig &lt;hch@lst.de&gt;
Cc: Dylan Yudaken &lt;dylany@fb.com&gt;
Cc: David Woodhouse &lt;dwmw@amazon.co.uk&gt;
Cc: Matthew Wilcox &lt;willy@infradead.org&gt;
Cc: linux-fsdevel@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Message-Id: &lt;tencent_7588DFD1F365950A757310D764517A14B306@qq.com&gt;
[brauner: rewrite commit message and add explanation how this underflow can happen]
Signed-off-by: Christian Brauner &lt;brauner@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 758b492047816a3158d027e9fca660bc5bcf20bf ]

For eventfd with flag EFD_SEMAPHORE, when its ctx-&gt;count is 0, calling
eventfd_ctx_do_read will cause ctx-&gt;count to overflow to ULLONG_MAX.

An underflow can happen with EFD_SEMAPHORE eventfds in at least the
following three subsystems:

(1) virt/kvm/eventfd.c
(2) drivers/vfio/virqfd.c
(3) drivers/virt/acrn/irqfd.c

where (2) and (3) are just modeled after (1). An eventfd must be
specified for use with the KVM_IRQFD ioctl(). This can also be an
EFD_SEMAPHORE eventfd. When the eventfd count is zero or has been
decremented to zero an underflow can be triggered when the irqfd is shut
down by raising the KVM_IRQFD_FLAG_DEASSIGN flag in the KVM_IRQFD
ioctl():

        // ctx-&gt;count == 0
        kvm_vm_ioctl()
        -&gt; kvm_irqfd()
           -&gt; kvm_irqfd_deassign()
              -&gt; irqfd_deactivate()
                 -&gt; irqfd_shutdown()
                    -&gt; eventfd_ctx_remove_wait_queue(&amp;cnt)
                       -&gt; eventfd_ctx_do_read(&amp;cnt)

Userspace polling on the eventfd wouldn't notice the underflow because 1
is always returned as the value from eventfd_read() while ctx-&gt;count
would've underflowed. It's not a huge deal because this should only be
happening when the irqfd is shutdown but we should still fix it and
avoid the spurious wakeup.

Fixes: cb289d6244a3 ("eventfd - allow atomic read and waitqueue remove")
Signed-off-by: Wen Yang &lt;wenyang.linux@foxmail.com&gt;
Cc: Alexander Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: Jens Axboe &lt;axboe@kernel.dk&gt;
Cc: Christian Brauner &lt;brauner@kernel.org&gt;
Cc: Christoph Hellwig &lt;hch@lst.de&gt;
Cc: Dylan Yudaken &lt;dylany@fb.com&gt;
Cc: David Woodhouse &lt;dwmw@amazon.co.uk&gt;
Cc: Matthew Wilcox &lt;willy@infradead.org&gt;
Cc: linux-fsdevel@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Message-Id: &lt;tencent_7588DFD1F365950A757310D764517A14B306@qq.com&gt;
[brauner: rewrite commit message and add explanation how this underflow can happen]
Signed-off-by: Christian Brauner &lt;brauner@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>eventfd: Export eventfd_ctx_do_read()</title>
<updated>2023-09-23T08:59:40+00:00</updated>
<author>
<name>David Woodhouse</name>
<email>dwmw@amazon.co.uk</email>
</author>
<published>2020-10-27T13:55:21+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=3e9617d63edf5fdd1d1bf5746dc594184142fc8f'/>
<id>3e9617d63edf5fdd1d1bf5746dc594184142fc8f</id>
<content type='text'>
[ Upstream commit 28f1326710555bbe666f64452d08f2d7dd657cae ]

Where events are consumed in the kernel, for example by KVM's
irqfd_wakeup() and VFIO's virqfd_wakeup(), they currently lack a
mechanism to drain the eventfd's counter.

Since the wait queue is already locked while the wakeup functions are
invoked, all they really need to do is call eventfd_ctx_do_read().

Add a check for the lock, and export it for them.

Signed-off-by: David Woodhouse &lt;dwmw@amazon.co.uk&gt;
Message-Id: &lt;20201027135523.646811-2-dwmw2@infradead.org&gt;
Signed-off-by: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
Stable-dep-of: 758b49204781 ("eventfd: prevent underflow for eventfd semaphores")
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 28f1326710555bbe666f64452d08f2d7dd657cae ]

Where events are consumed in the kernel, for example by KVM's
irqfd_wakeup() and VFIO's virqfd_wakeup(), they currently lack a
mechanism to drain the eventfd's counter.

Since the wait queue is already locked while the wakeup functions are
invoked, all they really need to do is call eventfd_ctx_do_read().

Add a check for the lock, and export it for them.

Signed-off-by: David Woodhouse &lt;dwmw@amazon.co.uk&gt;
Message-Id: &lt;20201027135523.646811-2-dwmw2@infradead.org&gt;
Signed-off-by: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
Stable-dep-of: 758b49204781 ("eventfd: prevent underflow for eventfd semaphores")
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>eventfd: track eventfd_signal() recursion depth</title>
<updated>2020-02-11T12:35:37+00:00</updated>
<author>
<name>Jens Axboe</name>
<email>axboe@kernel.dk</email>
</author>
<published>2020-02-02T15:23:03+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=844d2025b68d9f493129b0a9ec0738ebbfe6e847'/>
<id>844d2025b68d9f493129b0a9ec0738ebbfe6e847</id>
<content type='text'>
commit b5e683d5cab8cd433b06ae178621f083cabd4f63 upstream.

eventfd use cases from aio and io_uring can deadlock due to circular
or resursive calling, when eventfd_signal() tries to grab the waitqueue
lock. On top of that, it's also possible to construct notification
chains that are deep enough that we could blow the stack.

Add a percpu counter that tracks the percpu recursion depth, warn if we
exceed it. The counter is also exposed so that users of eventfd_signal()
can do the right thing if it's non-zero in the context where it is
called.

Cc: stable@vger.kernel.org # 4.19+
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit b5e683d5cab8cd433b06ae178621f083cabd4f63 upstream.

eventfd use cases from aio and io_uring can deadlock due to circular
or resursive calling, when eventfd_signal() tries to grab the waitqueue
lock. On top of that, it's also possible to construct notification
chains that are deep enough that we could blow the stack.

Add a percpu counter that tracks the percpu recursion depth, warn if we
exceed it. The counter is also exposed so that users of eventfd_signal()
can do the right thing if it's non-zero in the context where it is
called.

Cc: stable@vger.kernel.org # 4.19+
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>treewide: Add SPDX license identifier for missed files</title>
<updated>2019-05-21T08:50:45+00:00</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2019-05-19T12:08:55+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=457c89965399115e5cd8bf38f9c597293405703d'/>
<id>457c89965399115e5cd8bf38f9c597293405703d</id>
<content type='text'>
Add SPDX license identifiers to all files which:

 - Have no license information of any form

 - Have EXPORT_.*_SYMBOL_GPL inside which was used in the
   initial scan/conversion to ignore the file

These files fall under the project license, GPL v2 only. The resulting SPDX
license identifier is:

  GPL-2.0-only

Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Add SPDX license identifiers to all files which:

 - Have no license information of any form

 - Have EXPORT_.*_SYMBOL_GPL inside which was used in the
   initial scan/conversion to ignore the file

These files fall under the project license, GPL v2 only. The resulting SPDX
license identifier is:

  GPL-2.0-only

Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>fs/eventfd.c: make eventfd_ida static</title>
<updated>2019-05-15T02:52:51+00:00</updated>
<author>
<name>YueHaibing</name>
<email>yuehaibing@huawei.com</email>
</author>
<published>2019-05-14T22:45:22+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=ce528c4c20f94b48781a9bb9e32435d2b9782995'/>
<id>ce528c4c20f94b48781a9bb9e32435d2b9782995</id>
<content type='text'>
Fix sparse warning:

fs/eventfd.c:26:1: warning:
 symbol 'eventfd_ida' was not declared. Should it be static?

Link: http://lkml.kernel.org/r/20190413142348.34716-1-yuehaibing@huawei.com
Signed-off-by: YueHaibing &lt;yuehaibing@huawei.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Fix sparse warning:

fs/eventfd.c:26:1: warning:
 symbol 'eventfd_ida' was not declared. Should it be static?

Link: http://lkml.kernel.org/r/20190413142348.34716-1-yuehaibing@huawei.com
Signed-off-by: YueHaibing &lt;yuehaibing@huawei.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>eventfd: present id to userspace via fdinfo</title>
<updated>2019-05-15T02:52:51+00:00</updated>
<author>
<name>Masatake YAMATO</name>
<email>yamato@redhat.com</email>
</author>
<published>2019-05-14T22:45:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=b556db17b0e7c439bb6113b6dc7185bd0b1bbbb4'/>
<id>b556db17b0e7c439bb6113b6dc7185bd0b1bbbb4</id>
<content type='text'>
Finding endpoints of an IPC channel is one of essential task to
understand how a user program works.  Procfs and netlink socket provide
enough hints to find endpoints for IPC channels like pipes, unix
sockets, and pseudo terminals.  However, there is no simple way to find
endpoints for an eventfd file from userland.  An inode number doesn't
hint.  Unlike pipe, all eventfd files share the same inode object.

To provide the way to find endpoints of an eventfd file, this patch adds
"eventfd-id" field to /proc/PID/fdinfo of eventfd as identifier.
Integers managed by an IDA are used as ids.

A tool like lsof can utilize the information to print endpoints.

Link: http://lkml.kernel.org/r/20190327181823.20222-1-yamato@redhat.com
Signed-off-by: Masatake YAMATO &lt;yamato@redhat.com&gt;
Cc: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: Kees Cook &lt;keescook@chromium.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Finding endpoints of an IPC channel is one of essential task to
understand how a user program works.  Procfs and netlink socket provide
enough hints to find endpoints for IPC channels like pipes, unix
sockets, and pseudo terminals.  However, there is no simple way to find
endpoints for an eventfd file from userland.  An inode number doesn't
hint.  Unlike pipe, all eventfd files share the same inode object.

To provide the way to find endpoints of an eventfd file, this patch adds
"eventfd-id" field to /proc/PID/fdinfo of eventfd as identifier.
Integers managed by an IDA are used as ids.

A tool like lsof can utilize the information to print endpoints.

Link: http://lkml.kernel.org/r/20190327181823.20222-1-yamato@redhat.com
Signed-off-by: Masatake YAMATO &lt;yamato@redhat.com&gt;
Cc: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: Kees Cook &lt;keescook@chromium.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Revert changes to convert to -&gt;poll_mask() and aio IOCB_CMD_POLL</title>
<updated>2018-06-28T17:40:47+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2018-06-28T16:43:44+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=a11e1d432b51f63ba698d044441284a661f01144'/>
<id>a11e1d432b51f63ba698d044441284a661f01144</id>
<content type='text'>
The poll() changes were not well thought out, and completely
unexplained.  They also caused a huge performance regression, because
"-&gt;poll()" was no longer a trivial file operation that just called down
to the underlying file operations, but instead did at least two indirect
calls.

Indirect calls are sadly slow now with the Spectre mitigation, but the
performance problem could at least be largely mitigated by changing the
"-&gt;get_poll_head()" operation to just have a per-file-descriptor pointer
to the poll head instead.  That gets rid of one of the new indirections.

But that doesn't fix the new complexity that is completely unwarranted
for the regular case.  The (undocumented) reason for the poll() changes
was some alleged AIO poll race fixing, but we don't make the common case
slower and more complex for some uncommon special case, so this all
really needs way more explanations and most likely a fundamental
redesign.

[ This revert is a revert of about 30 different commits, not reverted
  individually because that would just be unnecessarily messy  - Linus ]

Cc: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The poll() changes were not well thought out, and completely
unexplained.  They also caused a huge performance regression, because
"-&gt;poll()" was no longer a trivial file operation that just called down
to the underlying file operations, but instead did at least two indirect
calls.

Indirect calls are sadly slow now with the Spectre mitigation, but the
performance problem could at least be largely mitigated by changing the
"-&gt;get_poll_head()" operation to just have a per-file-descriptor pointer
to the poll head instead.  That gets rid of one of the new indirections.

But that doesn't fix the new complexity that is completely unwarranted
for the regular case.  The (undocumented) reason for the poll() changes
was some alleged AIO poll race fixing, but we don't make the common case
slower and more complex for some uncommon special case, so this all
really needs way more explanations and most likely a fundamental
redesign.

[ This revert is a revert of about 30 different commits, not reverted
  individually because that would just be unnecessarily messy  - Linus ]

Cc: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>eventfd: only return events requested in poll_mask()</title>
<updated>2018-06-15T00:07:38+00:00</updated>
<author>
<name>Avi Kivity</name>
<email>avi@scylladb.com</email>
</author>
<published>2018-06-08T19:12:32+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=4d572d9f46507be8cfe326aa5bc3698babcbdfa7'/>
<id>4d572d9f46507be8cfe326aa5bc3698babcbdfa7</id>
<content type='text'>
The -&gt;poll_mask() operation has a mask of events that the caller
is interested in, but we're returning all events regardless.

Change to return only the events the caller is interested in. This
fixes aio IO_CMD_POLL returning immediately when called with POLLIN
on an eventfd, since an eventfd is almost always ready for a write.

Signed-off-by: Avi Kivity &lt;avi@scylladb.com&gt;
Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The -&gt;poll_mask() operation has a mask of events that the caller
is interested in, but we're returning all events regardless.

Change to return only the events the caller is interested in. This
fixes aio IO_CMD_POLL returning immediately when called with POLLIN
on an eventfd, since an eventfd is almost always ready for a write.

Signed-off-by: Avi Kivity &lt;avi@scylladb.com&gt;
Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>eventfd: switch to -&gt;poll_mask</title>
<updated>2018-05-26T07:16:44+00:00</updated>
<author>
<name>Christoph Hellwig</name>
<email>hch@lst.de</email>
</author>
<published>2017-12-31T15:42:48+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=9e42f195f5b5cae4cf8ec71432c0fdaf641d1b73'/>
<id>9e42f195f5b5cae4cf8ec71432c0fdaf641d1b73</id>
<content type='text'>
Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>fs: add do_eventfd() helper; remove internal call to sys_eventfd()</title>
<updated>2018-04-02T18:15:39+00:00</updated>
<author>
<name>Dominik Brodowski</name>
<email>linux@dominikbrodowski.net</email>
</author>
<published>2018-03-11T10:34:37+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=2fc96f8331ba7d08ddc126d154a1504b9a79b79e'/>
<id>2fc96f8331ba7d08ddc126d154a1504b9a79b79e</id>
<content type='text'>
Using this helper removes an in-kernel call to the sys_eventfd() syscall.

This patch is part of a series which removes in-kernel calls to syscalls.
On this basis, the syscall entry path can be streamlined. For details, see
http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net

Cc: Alexander Viro &lt;viro@zeniv.linux.org.uk&gt;
Signed-off-by: Dominik Brodowski &lt;linux@dominikbrodowski.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Using this helper removes an in-kernel call to the sys_eventfd() syscall.

This patch is part of a series which removes in-kernel calls to syscalls.
On this basis, the syscall entry path can be streamlined. For details, see
http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net

Cc: Alexander Viro &lt;viro@zeniv.linux.org.uk&gt;
Signed-off-by: Dominik Brodowski &lt;linux@dominikbrodowski.net&gt;
</pre>
</div>
</content>
</entry>
</feed>
