linux.git/fs/inode.c, branch v4.5

writeback: initialize inode members that track writeback history

2016-02-16T21:57:21+00:00

inode struct members that track cgroup writeback information
should be reinitialized when inode gets allocated from
kmem_cache. Otherwise, their values remain and get used by the
new inode.

Signed-off-by: Tahsin Erdogan 
Acked-by: Tejun Heo 
Fixes: d10c80955265 ("writeback: implement foreign cgroup inode bdi_writeback switching")
Signed-off-by: Jens Axboe

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

2016-01-23T20:24:56+00:00

Pull final vfs updates from Al Viro:

 - The ->i_mutex wrappers (with small prereq in lustre)

 - a fix for too early freeing of symlink bodies on shmem (they need to
   be RCU-delayed) (-stable fodder)

 - followup to dedupe stuff merged this cycle

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  vfs: abort dedupe loop if fatal signals are pending
  make sure that freeing shmem fast symlinks is RCU-delayed
  wrappers for ->i_mutex access
  lustre: remove unused declaration

dax: support dirty DAX entries in radix tree

2016-01-23T01:02:18+00:00

Add support for tracking dirty DAX entries in the struct address_space
radix tree.  This tree is already used for dirty page writeback, and it
already supports the use of exceptional (non struct page*) entries.

In order to properly track dirty DAX pages we will insert new
exceptional entries into the radix tree that represent dirty DAX PTE or
PMD pages.  These exceptional entries will also contain the writeback
addresses for the PTE or PMD faults that we can use at fsync/msync time.

There are currently two types of exceptional entries (shmem and shadow)
that can be placed into the radix tree, and this adds a third.  We rely
on the fact that only one type of exceptional entry can be found in a
given radix tree based on its usage.  This happens for free with DAX vs
shmem but we explicitly prevent shadow entries from being added to radix
trees for DAX mappings.

The only shadow entries that would be generated for DAX radix trees
would be to track zero page mappings that were created for holes.  These
pages would receive minimal benefit from having shadow entries, and the
choice to have only one type of exceptional entry in a given radix tree
makes the logic simpler both in clear_exceptional_entry() and in the
rest of DAX.

Signed-off-by: Ross Zwisler 
Cc: "H. Peter Anvin" 
Cc: "J. Bruce Fields" 
Cc: "Theodore Ts'o" 
Cc: Alexander Viro 
Cc: Andreas Dilger 
Cc: Dave Chinner 
Cc: Ingo Molnar 
Cc: Jan Kara 
Cc: Jeff Layton 
Cc: Matthew Wilcox 
Cc: Thomas Gleixner 
Cc: Dan Williams 
Cc: Matthew Wilcox 
Cc: Dave Hansen 
Cc: Hugh Dickins 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds

wrappers for ->i_mutex access

2016-01-22T23:04:28+00:00

parallel to mutex_{lock,unlock,trylock,is_locked,lock_nested},
inode_foo(inode) being mutex_foo(&inode->i_mutex).

Please, use those for access to ->i_mutex; over the coming cycle
->i_mutex will become rwsem, with ->lookup() done with it held
only shared.

Signed-off-by: Al Viro

kmemcg: account certain kmem allocations to memcg

2016-01-15T00:00:49+00:00

Mark those kmem allocations that are known to be easily triggered from
userspace as __GFP_ACCOUNT/SLAB_ACCOUNT, which makes them accounted to
memcg.  For the list, see below:

 - threadinfo
 - task_struct
 - task_delay_info
 - pid
 - cred
 - mm_struct
 - vm_area_struct and vm_region (nommu)
 - anon_vma and anon_vma_chain
 - signal_struct
 - sighand_struct
 - fs_struct
 - files_struct
 - fdtable and fdtable->full_fds_bits
 - dentry and external_name
 - inode for all filesystems. This is the most tedious part, because
   most filesystems overwrite the alloc_inode method.

The list is far from complete, so feel free to add more objects.
Nevertheless, it should be close to "account everything" approach and
keep most workloads within bounds.  Malevolent users will be able to
breach the limit, but this was possible even with the former "account
everything" approach (simply because it did not account everything in
fact).

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Vladimir Davydov 
Acked-by: Johannes Weiner 
Acked-by: Michal Hocko 
Cc: Tejun Heo 
Cc: Greg Thelen 
Cc: Christoph Lameter 
Cc: Pekka Enberg 
Cc: David Rientjes 
Cc: Joonsoo Kim 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds

Merge tag 'locks-v4.5-1' of git://git.samba.org/jlayton/linux

2016-01-12T23:46:17+00:00

Pull file locking updates from Jeff Layton:
 "File locking related changes for v4.5 (pile #1)

  Highlights:
   - new Kconfig option to allow disabling mandatory locking (which is
     racy anyway)
   - new tracepoints for setlk and close codepaths
   - fix for a long-standing bug in code that handles races between
     setting a POSIX lock and close()"

* tag 'locks-v4.5-1' of git://git.samba.org/jlayton/linux:
  locks: rename __posix_lock_file to posix_lock_inode
  locks: prink more detail when there are leaked locks
  locks: pass inode pointer to locks_free_lock_context
  locks: sprinkle some tracepoints around the file locking code
  locks: don't check for race with close when setting OFD lock
  locks: fix unlock when fcntl_setlk races with a close
  fs: make locks.c explicitly non-modular
  locks: use list_first_entry_or_null()
  locks: Don't allow mounts in user namespaces to enable mandatory locking
  locks: Allow disabling mandatory locking at compile time

locks: pass inode pointer to locks_free_lock_context

2016-01-08T16:38:19+00:00

...so we can print information about it if there are leaked locks.

Signed-off-by: Jeff Layton 
Acked-by: "J. Bruce Fields"

don't put symlink bodies in pagecache into highmem

2015-12-09T03:41:36+00:00

kmap() in page_follow_link_light() needed to go - allowing to hold
an arbitrary number of kmaps for long is a great way to deadlocking
the system.

new helper (inode_nohighmem(inode)) needs to be used for pagecache
symlinks inodes; done for all in-tree cases.  page_follow_link_light()
instrumented to yell about anything missed.

Signed-off-by: Al Viro

fs/inode.c: fix kernel-doc warning

2015-11-09T23:11:24+00:00

Fix kernel-doc warning in fs/inode.c:

  ../fs/inode.c:1606: warning: No description found for parameter 'inode'

Signed-off-by: Randy Dunlap 
Cc: Al Viro 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds

inode: don't softlockup when evicting inodes

2015-08-18T17:20:09+00:00

On a box with a lot of ram (148gb) I can make the box softlockup after running
an fs_mark job that creates hundreds of millions of empty files.  This is
because we never generate enough memory pressure to keep the number of inodes on
our unused list low, so when we go to unmount we have to evict ~100 million
inodes.  This makes one processor a very unhappy person, so add a cond_resched()
in dispose_list() and if we need a resched when processing the s_inodes list do
that and run dispose_list() on what we've currently culled.  Thanks,

Signed-off-by: Josef Bacik 
Reviewed-by: Jan Kara