linux.git/fs/fuse, branch v5.1

Merge branch 'page-refs' (page ref overflow)

2019-04-14T22:09:40+00:00

Merge page ref overflow branch.

Jann Horn reported that he can overflow the page ref count with
sufficient memory (and a filesystem that is intentionally extremely
slow).

Admittedly it's not exactly easy.  To have more than four billion
references to a page requires a minimum of 32GB of kernel memory just
for the pointers to the pages, much less any metadata to keep track of
those pointers.  Jann needed a total of 140GB of memory and a specially
crafted filesystem that leaves all reads pending (in order to not ever
free the page references and just keep adding more).

Still, we have a fairly straightforward way to limit the two obvious
user-controllable sources of page references: direct-IO like page
references gotten through get_user_pages(), and the splice pipe page
duplication.  So let's just do that.

* branch page-refs:
  fs: prevent page refcount overflow in pipe_buf_get
  mm: prevent get_user_pages() from overflowing page refcount
  mm: add 'try_get_page()' helper function
  mm: make page ref count overflow check tighter and more explicit

fs: prevent page refcount overflow in pipe_buf_get

2019-04-14T17:00:04+00:00

Change pipe_buf_get() to return a bool indicating whether it succeeded
in raising the refcount of the page (if the thing in the pipe is a page).
This removes another mechanism for overflowing the page refcount.  All
callers converted to handle a failure.

Reported-by: Jann Horn 
Signed-off-by: Matthew Wilcox 
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds

Merge tag 'fuse-update-5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse

2019-03-12T21:46:26+00:00

Pull fuse updates from Miklos Szeredi:
 "Scalability and performance improvements, as well as minor bug fixes
  and cleanups"

* tag 'fuse-update-5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse: (25 commits)
  fuse: cache readdir calls if filesystem opts out of opendir
  fuse: support clients that don't implement 'opendir'
  fuse: lift bad inode checks into callers
  fuse: multiplex cached/direct_io file operations
  fuse add copy_file_range to direct io fops
  fuse: use iov_iter based generic splice helpers
  fuse: Switch to using async direct IO for FOPEN_DIRECT_IO
  fuse: use atomic64_t for khctr
  fuse: clean up aborted
  fuse: Protect ff->reserved_req via corresponding fi->lock
  fuse: Protect fi->nlookup with fi->lock
  fuse: Introduce fi->lock to protect write related fields
  fuse: Convert fc->attr_version into atomic64_t
  fuse: Add fuse_inode argument to fuse_prepare_release()
  fuse: Verify userspace asks to requeue interrupt that we really sent
  fuse: Do some refactoring in fuse_dev_do_write()
  fuse: Wake up req->waitq of only if not background
  fuse: Optimize request_end() by not taking fiq->waitq.lock
  fuse: Kill fasync only if interrupt is queued in queue_interrupt()
  fuse: Remove stale comment in end_requests()
  ...

mm: refactor readahead defines in mm.h

2019-03-12T17:04:01+00:00

All users of VM_MAX_READAHEAD actually convert it to kbytes and then to
pages. Define the macro explicitly as (SZ_128K / PAGE_SIZE). This
simplifies the expression in every filesystem. Also rename the macro to
VM_READAHEAD_PAGES to properly convey its meaning. Finally remove unused
VM_MIN_READAHEAD

[akpm@linux-foundation.org: fix fs/io_uring.c, per Stephen]
Link: http://lkml.kernel.org/r/20181221144053.24318-1-nborisov@suse.com
Signed-off-by: Nikolay Borisov 
Reviewed-by: Matthew Wilcox 
Reviewed-by: David Hildenbrand 
Cc: Jens Axboe 
Cc: Eric Van Hensbergen 
Cc: Latchesar Ionkov 
Cc: Dominique Martinet 
Cc: David Howells 
Cc: Chris Mason 
Cc: Josef Bacik 
Cc: David Sterba 
Cc: Miklos Szeredi 
Cc: Stephen Rothwell 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds

fuse: cache readdir calls if filesystem opts out of opendir

2019-02-13T12:15:15+00:00

If a filesystem returns ENOSYS from opendir and thus opts out of
opendir and releasedir requests, it almost certainly would also like
readdir results cached. Default open_flags to FOPEN_KEEP_CACHE and
FOPEN_CACHE_DIR in that case.

With this patch, I've measured recursive directory enumeration across
large FUSE mounts to be faster than native mounts.

Signed-off-by: Chad Austin 
Signed-off-by: Miklos Szeredi

fuse: support clients that don't implement 'opendir'

2019-02-13T12:15:15+00:00

Allow filesystems to return ENOSYS from opendir, preventing the kernel from
sending opendir and releasedir messages in the future. This avoids
userspace transitions when filesystems don't need to keep track of state
per directory handle.

A new capability flag, FUSE_NO_OPENDIR_SUPPORT, parallels
FUSE_NO_OPEN_SUPPORT, indicating the new semantics for returning ENOSYS
from opendir.

Signed-off-by: Chad Austin 
Signed-off-by: Miklos Szeredi

fuse: lift bad inode checks into callers

2019-02-13T12:15:15+00:00

Bad inode checks were done  done in various places, and move them into
fuse_file_{read|write}_iter().

Signed-off-by: Miklos Szeredi

fuse: multiplex cached/direct_io file operations

2019-02-13T12:15:15+00:00

This is cleanup, as well as allowing switching between I/O modes while the
file is open in the future.

Signed-off-by: Miklos Szeredi

fuse add copy_file_range to direct io fops

2019-02-13T12:15:14+00:00

Nothing preventing copy_file_range to work on files opened with
FOPEN_DIRECT_IO.

Fixes: 88bc7d5097a1 ("fuse: add support for copy_file_range()")
Signed-off-by: Miklos Szeredi

fuse: use iov_iter based generic splice helpers

2019-02-13T12:15:14+00:00

The default splice implementation is grossly inefficient and the iter based
ones work just fine, so use those instead.  I've measured an 8x speedup for
splice write (with len = 128k).

Signed-off-by: Miklos Szeredi