<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/fs/fuse, branch v5.1</title>
<subtitle>Linux kernel source tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/'/>
<entry>
<title>Merge branch 'page-refs' (page ref overflow)</title>
<updated>2019-04-14T22:09:40+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2019-04-14T22:09:40+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=6b3a707736301c2128ca85ce85fb13f60b5e350a'/>
<id>6b3a707736301c2128ca85ce85fb13f60b5e350a</id>
<content type='text'>
Merge page ref overflow branch.

Jann Horn reported that he can overflow the page ref count with
sufficient memory (and a filesystem that is intentionally extremely
slow).

Admittedly it's not exactly easy.  To have more than four billion
references to a page requires a minimum of 32GB of kernel memory just
for the pointers to the pages, much less any metadata to keep track of
those pointers.  Jann needed a total of 140GB of memory and a specially
crafted filesystem that leaves all reads pending (in order to not ever
free the page references and just keep adding more).

Still, we have a fairly straightforward way to limit the two obvious
user-controllable sources of page references: direct-IO like page
references gotten through get_user_pages(), and the splice pipe page
duplication.  So let's just do that.

* branch page-refs:
  fs: prevent page refcount overflow in pipe_buf_get
  mm: prevent get_user_pages() from overflowing page refcount
  mm: add 'try_get_page()' helper function
  mm: make page ref count overflow check tighter and more explicit
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Merge page ref overflow branch.

Jann Horn reported that he can overflow the page ref count with
sufficient memory (and a filesystem that is intentionally extremely
slow).

Admittedly it's not exactly easy.  To have more than four billion
references to a page requires a minimum of 32GB of kernel memory just
for the pointers to the pages, much less any metadata to keep track of
those pointers.  Jann needed a total of 140GB of memory and a specially
crafted filesystem that leaves all reads pending (in order to not ever
free the page references and just keep adding more).

Still, we have a fairly straightforward way to limit the two obvious
user-controllable sources of page references: direct-IO like page
references gotten through get_user_pages(), and the splice pipe page
duplication.  So let's just do that.

* branch page-refs:
  fs: prevent page refcount overflow in pipe_buf_get
  mm: prevent get_user_pages() from overflowing page refcount
  mm: add 'try_get_page()' helper function
  mm: make page ref count overflow check tighter and more explicit
</pre>
</div>
</content>
</entry>
<entry>
<title>fs: prevent page refcount overflow in pipe_buf_get</title>
<updated>2019-04-14T17:00:04+00:00</updated>
<author>
<name>Matthew Wilcox</name>
<email>willy@infradead.org</email>
</author>
<published>2019-04-05T21:02:10+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=15fab63e1e57be9fdb5eec1bbc5916e9825e9acb'/>
<id>15fab63e1e57be9fdb5eec1bbc5916e9825e9acb</id>
<content type='text'>
Change pipe_buf_get() to return a bool indicating whether it succeeded
in raising the refcount of the page (if the thing in the pipe is a page).
This removes another mechanism for overflowing the page refcount.  All
callers converted to handle a failure.

Reported-by: Jann Horn &lt;jannh@google.com&gt;
Signed-off-by: Matthew Wilcox &lt;willy@infradead.org&gt;
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Change pipe_buf_get() to return a bool indicating whether it succeeded
in raising the refcount of the page (if the thing in the pipe is a page).
This removes another mechanism for overflowing the page refcount.  All
callers converted to handle a failure.

Reported-by: Jann Horn &lt;jannh@google.com&gt;
Signed-off-by: Matthew Wilcox &lt;willy@infradead.org&gt;
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge tag 'fuse-update-5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse</title>
<updated>2019-03-12T21:46:26+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2019-03-12T21:46:26+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=dfee9c257b102d7c0407629eef2ed32e152de0d2'/>
<id>dfee9c257b102d7c0407629eef2ed32e152de0d2</id>
<content type='text'>
Pull fuse updates from Miklos Szeredi:
 "Scalability and performance improvements, as well as minor bug fixes
  and cleanups"

* tag 'fuse-update-5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse: (25 commits)
  fuse: cache readdir calls if filesystem opts out of opendir
  fuse: support clients that don't implement 'opendir'
  fuse: lift bad inode checks into callers
  fuse: multiplex cached/direct_io file operations
  fuse add copy_file_range to direct io fops
  fuse: use iov_iter based generic splice helpers
  fuse: Switch to using async direct IO for FOPEN_DIRECT_IO
  fuse: use atomic64_t for khctr
  fuse: clean up aborted
  fuse: Protect ff-&gt;reserved_req via corresponding fi-&gt;lock
  fuse: Protect fi-&gt;nlookup with fi-&gt;lock
  fuse: Introduce fi-&gt;lock to protect write related fields
  fuse: Convert fc-&gt;attr_version into atomic64_t
  fuse: Add fuse_inode argument to fuse_prepare_release()
  fuse: Verify userspace asks to requeue interrupt that we really sent
  fuse: Do some refactoring in fuse_dev_do_write()
  fuse: Wake up req-&gt;waitq of only if not background
  fuse: Optimize request_end() by not taking fiq-&gt;waitq.lock
  fuse: Kill fasync only if interrupt is queued in queue_interrupt()
  fuse: Remove stale comment in end_requests()
  ...
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull fuse updates from Miklos Szeredi:
 "Scalability and performance improvements, as well as minor bug fixes
  and cleanups"

* tag 'fuse-update-5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse: (25 commits)
  fuse: cache readdir calls if filesystem opts out of opendir
  fuse: support clients that don't implement 'opendir'
  fuse: lift bad inode checks into callers
  fuse: multiplex cached/direct_io file operations
  fuse add copy_file_range to direct io fops
  fuse: use iov_iter based generic splice helpers
  fuse: Switch to using async direct IO for FOPEN_DIRECT_IO
  fuse: use atomic64_t for khctr
  fuse: clean up aborted
  fuse: Protect ff-&gt;reserved_req via corresponding fi-&gt;lock
  fuse: Protect fi-&gt;nlookup with fi-&gt;lock
  fuse: Introduce fi-&gt;lock to protect write related fields
  fuse: Convert fc-&gt;attr_version into atomic64_t
  fuse: Add fuse_inode argument to fuse_prepare_release()
  fuse: Verify userspace asks to requeue interrupt that we really sent
  fuse: Do some refactoring in fuse_dev_do_write()
  fuse: Wake up req-&gt;waitq of only if not background
  fuse: Optimize request_end() by not taking fiq-&gt;waitq.lock
  fuse: Kill fasync only if interrupt is queued in queue_interrupt()
  fuse: Remove stale comment in end_requests()
  ...
</pre>
</div>
</content>
</entry>
<entry>
<title>mm: refactor readahead defines in mm.h</title>
<updated>2019-03-12T17:04:01+00:00</updated>
<author>
<name>Nikolay Borisov</name>
<email>nborisov@suse.com</email>
</author>
<published>2019-03-12T06:28:13+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=b5420237ec817b0b5f729a674c81ace0865c3b3b'/>
<id>b5420237ec817b0b5f729a674c81ace0865c3b3b</id>
<content type='text'>
All users of VM_MAX_READAHEAD actually convert it to kbytes and then to
pages. Define the macro explicitly as (SZ_128K / PAGE_SIZE). This
simplifies the expression in every filesystem. Also rename the macro to
VM_READAHEAD_PAGES to properly convey its meaning. Finally remove unused
VM_MIN_READAHEAD

[akpm@linux-foundation.org: fix fs/io_uring.c, per Stephen]
Link: http://lkml.kernel.org/r/20181221144053.24318-1-nborisov@suse.com
Signed-off-by: Nikolay Borisov &lt;nborisov@suse.com&gt;
Reviewed-by: Matthew Wilcox &lt;willy@infradead.org&gt;
Reviewed-by: David Hildenbrand &lt;david@redhat.com&gt;
Cc: Jens Axboe &lt;axboe@kernel.dk&gt;
Cc: Eric Van Hensbergen &lt;ericvh@gmail.com&gt;
Cc: Latchesar Ionkov &lt;lucho@ionkov.net&gt;
Cc: Dominique Martinet &lt;asmadeus@codewreck.org&gt;
Cc: David Howells &lt;dhowells@redhat.com&gt;
Cc: Chris Mason &lt;clm@fb.com&gt;
Cc: Josef Bacik &lt;josef@toxicpanda.com&gt;
Cc: David Sterba &lt;dsterba@suse.com&gt;
Cc: Miklos Szeredi &lt;miklos@szeredi.hu&gt;
Cc: Stephen Rothwell &lt;sfr@canb.auug.org.au&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
All users of VM_MAX_READAHEAD actually convert it to kbytes and then to
pages. Define the macro explicitly as (SZ_128K / PAGE_SIZE). This
simplifies the expression in every filesystem. Also rename the macro to
VM_READAHEAD_PAGES to properly convey its meaning. Finally remove unused
VM_MIN_READAHEAD

[akpm@linux-foundation.org: fix fs/io_uring.c, per Stephen]
Link: http://lkml.kernel.org/r/20181221144053.24318-1-nborisov@suse.com
Signed-off-by: Nikolay Borisov &lt;nborisov@suse.com&gt;
Reviewed-by: Matthew Wilcox &lt;willy@infradead.org&gt;
Reviewed-by: David Hildenbrand &lt;david@redhat.com&gt;
Cc: Jens Axboe &lt;axboe@kernel.dk&gt;
Cc: Eric Van Hensbergen &lt;ericvh@gmail.com&gt;
Cc: Latchesar Ionkov &lt;lucho@ionkov.net&gt;
Cc: Dominique Martinet &lt;asmadeus@codewreck.org&gt;
Cc: David Howells &lt;dhowells@redhat.com&gt;
Cc: Chris Mason &lt;clm@fb.com&gt;
Cc: Josef Bacik &lt;josef@toxicpanda.com&gt;
Cc: David Sterba &lt;dsterba@suse.com&gt;
Cc: Miklos Szeredi &lt;miklos@szeredi.hu&gt;
Cc: Stephen Rothwell &lt;sfr@canb.auug.org.au&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>fuse: cache readdir calls if filesystem opts out of opendir</title>
<updated>2019-02-13T12:15:15+00:00</updated>
<author>
<name>Chad Austin</name>
<email>chadaustin@fb.com</email>
</author>
<published>2019-01-29T00:34:34+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=fabf7e0262d0bd57739d29aeac94c44b0542ff1f'/>
<id>fabf7e0262d0bd57739d29aeac94c44b0542ff1f</id>
<content type='text'>
If a filesystem returns ENOSYS from opendir and thus opts out of
opendir and releasedir requests, it almost certainly would also like
readdir results cached. Default open_flags to FOPEN_KEEP_CACHE and
FOPEN_CACHE_DIR in that case.

With this patch, I've measured recursive directory enumeration across
large FUSE mounts to be faster than native mounts.

Signed-off-by: Chad Austin &lt;chadaustin@fb.com&gt;
Signed-off-by: Miklos Szeredi &lt;mszeredi@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
If a filesystem returns ENOSYS from opendir and thus opts out of
opendir and releasedir requests, it almost certainly would also like
readdir results cached. Default open_flags to FOPEN_KEEP_CACHE and
FOPEN_CACHE_DIR in that case.

With this patch, I've measured recursive directory enumeration across
large FUSE mounts to be faster than native mounts.

Signed-off-by: Chad Austin &lt;chadaustin@fb.com&gt;
Signed-off-by: Miklos Szeredi &lt;mszeredi@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>fuse: support clients that don't implement 'opendir'</title>
<updated>2019-02-13T12:15:15+00:00</updated>
<author>
<name>Chad Austin</name>
<email>chadaustin@fb.com</email>
</author>
<published>2019-01-08T00:53:17+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=d9a9ea94f748f47b1d75c6c5e33edcf74476c445'/>
<id>d9a9ea94f748f47b1d75c6c5e33edcf74476c445</id>
<content type='text'>
Allow filesystems to return ENOSYS from opendir, preventing the kernel from
sending opendir and releasedir messages in the future. This avoids
userspace transitions when filesystems don't need to keep track of state
per directory handle.

A new capability flag, FUSE_NO_OPENDIR_SUPPORT, parallels
FUSE_NO_OPEN_SUPPORT, indicating the new semantics for returning ENOSYS
from opendir.

Signed-off-by: Chad Austin &lt;chadaustin@fb.com&gt;
Signed-off-by: Miklos Szeredi &lt;mszeredi@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Allow filesystems to return ENOSYS from opendir, preventing the kernel from
sending opendir and releasedir messages in the future. This avoids
userspace transitions when filesystems don't need to keep track of state
per directory handle.

A new capability flag, FUSE_NO_OPENDIR_SUPPORT, parallels
FUSE_NO_OPEN_SUPPORT, indicating the new semantics for returning ENOSYS
from opendir.

Signed-off-by: Chad Austin &lt;chadaustin@fb.com&gt;
Signed-off-by: Miklos Szeredi &lt;mszeredi@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>fuse: lift bad inode checks into callers</title>
<updated>2019-02-13T12:15:15+00:00</updated>
<author>
<name>Miklos Szeredi</name>
<email>mszeredi@redhat.com</email>
</author>
<published>2019-01-24T09:40:17+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=2f7b6f5bed01a3fd2abcc20d2c85b7c532eb95cd'/>
<id>2f7b6f5bed01a3fd2abcc20d2c85b7c532eb95cd</id>
<content type='text'>
Bad inode checks were done  done in various places, and move them into
fuse_file_{read|write}_iter().

Signed-off-by: Miklos Szeredi &lt;mszeredi@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Bad inode checks were done  done in various places, and move them into
fuse_file_{read|write}_iter().

Signed-off-by: Miklos Szeredi &lt;mszeredi@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>fuse: multiplex cached/direct_io file operations</title>
<updated>2019-02-13T12:15:15+00:00</updated>
<author>
<name>Miklos Szeredi</name>
<email>mszeredi@redhat.com</email>
</author>
<published>2019-01-24T09:40:17+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=55752a3aba1387887afa024a0732f8ae52fb0645'/>
<id>55752a3aba1387887afa024a0732f8ae52fb0645</id>
<content type='text'>
This is cleanup, as well as allowing switching between I/O modes while the
file is open in the future.

Signed-off-by: Miklos Szeredi &lt;mszeredi@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This is cleanup, as well as allowing switching between I/O modes while the
file is open in the future.

Signed-off-by: Miklos Szeredi &lt;mszeredi@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>fuse add copy_file_range to direct io fops</title>
<updated>2019-02-13T12:15:14+00:00</updated>
<author>
<name>Miklos Szeredi</name>
<email>mszeredi@redhat.com</email>
</author>
<published>2019-01-24T09:40:17+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=d4136d60751a5f45f47f1c3a77f6e8bafa11be1f'/>
<id>d4136d60751a5f45f47f1c3a77f6e8bafa11be1f</id>
<content type='text'>
Nothing preventing copy_file_range to work on files opened with
FOPEN_DIRECT_IO.

Fixes: 88bc7d5097a1 ("fuse: add support for copy_file_range()")
Signed-off-by: Miklos Szeredi &lt;mszeredi@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Nothing preventing copy_file_range to work on files opened with
FOPEN_DIRECT_IO.

Fixes: 88bc7d5097a1 ("fuse: add support for copy_file_range()")
Signed-off-by: Miklos Szeredi &lt;mszeredi@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>fuse: use iov_iter based generic splice helpers</title>
<updated>2019-02-13T12:15:14+00:00</updated>
<author>
<name>Miklos Szeredi</name>
<email>mszeredi@redhat.com</email>
</author>
<published>2019-01-24T09:40:17+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=3c3db095b68c5df901d837a01a69dcd2693f85f6'/>
<id>3c3db095b68c5df901d837a01a69dcd2693f85f6</id>
<content type='text'>
The default splice implementation is grossly inefficient and the iter based
ones work just fine, so use those instead.  I've measured an 8x speedup for
splice write (with len = 128k).

Signed-off-by: Miklos Szeredi &lt;mszeredi@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The default splice implementation is grossly inefficient and the iter based
ones work just fine, so use those instead.  I've measured an 8x speedup for
splice write (with len = 128k).

Signed-off-by: Miklos Szeredi &lt;mszeredi@redhat.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
