linux-stable.git/fs/inode.c, branch v4.14.331

fs: fix UAF/GPF bug in nilfs_mdt_destroy

2022-10-26T11:16:51+00:00

commit 2e488f13755ffbb60f307e991b27024716a33b29 upstream.

In alloc_inode, inode_init_always() could return -ENOMEM if
security_inode_alloc() fails, which causes inode->i_private
uninitialized. Then nilfs_is_metadata_file_inode() returns
true and nilfs_free_inode() wrongly calls nilfs_mdt_destroy(),
which frees the uninitialized inode->i_private
and leads to crashes(e.g., UAF/GPF).

Fix this by moving security_inode_alloc just prior to
this_cpu_inc(nr_inodes)

Link: https://lkml.kernel.org/r/CAFcO6XOcf1Jj2SeGt=jJV59wmhESeSKpfR0omdFRq+J9nD1vfQ@mail.gmail.com
Reported-by: butt3rflyh4ck 
Reported-by: Hao Sun 
Reported-by: Jiacheng Xu 
Reviewed-by: Christian Brauner (Microsoft) 
Signed-off-by: Dongliang Mu 
Cc: Al Viro 
Cc: stable@vger.kernel.org
Signed-off-by: Al Viro 
Signed-off-by: Greg Kroah-Hartman

futex: Fix inode life-time issue

2020-04-02T14:34:21+00:00

commit 8019ad13ef7f64be44d4f892af9c840179009254 upstream.

As reported by Jann, ihold() does not in fact guarantee inode
persistence. And instead of making it so, replace the usage of inode
pointers with a per boot, machine wide, unique inode identifier.

This sequence number is global, but shared (file backed) futexes are
rare enough that this should not become a performance issue.

Reported-by: Jann Horn 
Suggested-by: Linus Torvalds 
Signed-off-by: Peter Zijlstra (Intel) 
Signed-off-by: Greg Kroah-Hartman

fs: avoid softlockups in s_inodes iterators

2020-01-12T11:11:59+00:00

[ Upstream commit 04646aebd30b99f2cfa0182435a2ec252fcb16d0 ]

Anything that walks all inodes on sb->s_inodes list without rescheduling
risks softlockups.

Previous efforts were made in 2 functions, see:

c27d82f fs/drop_caches.c: avoid softlockups in drop_pagecache_sb()
ac05fbb inode: don't softlockup when evicting inodes

but there hasn't been an audit of all walkers, so do that now.  This
also consistently moves the cond_resched() calls to the bottom of each
loop in cases where it already exists.

One loop remains: remove_dquot_ref(), because I'm not quite sure how
to deal with that one w/o taking the i_lock.

Signed-off-by: Eric Sandeen 
Reviewed-by: Jan Kara 
Signed-off-by: Al Viro 
Signed-off-by: Sasha Levin

Abort file_remove_privs() for non-reg. files

2019-06-22T06:16:19+00:00

commit f69e749a49353d96af1a293f56b5b56de59c668a upstream.

file_remove_privs() might be called for non-regular files, e.g.
blkdev inode. There is no reason to do its job on things
like blkdev inodes, pipes, or cdevs. Hence, abort if
file does not refer to a regular inode.

AV: more to the point, for devices there might be any number of
inodes refering to given device.  Which one to strip the permissions
from, even if that made any sense in the first place?  All of them
will be observed with contents modified, after all.

Found by LockDoc (Alexander Lochmann, Horst Schirmeier and Olaf
Spinczyk)

Reviewed-by: Jan Kara 
Signed-off-by: Alexander Lochmann 
Signed-off-by: Horst Schirmeier 
Signed-off-by: Al Viro 
Cc: Zubin Mithra 
Signed-off-by: Greg Kroah-Hartman

Fix up non-directory creation in SGID directories

2018-07-17T09:39:27+00:00

commit 0fa3ecd87848c9c93c2c828ef4c3a8ca36ce46c7 upstream.

sgid directories have special semantics, making newly created files in
the directory belong to the group of the directory, and newly created
subdirectories will also become sgid.  This is historically used for
group-shared directories.

But group directories writable by non-group members should not imply
that such non-group members can magically join the group, so make sure
to clear the sgid bit on non-directories for non-members (but remember
that sgid without group execute means "mandatory locking", just to
confuse things even more).

Reported-by: Jann Horn 
Cc: Andy Lutomirski 
Cc: Al Viro 
Signed-off-by: Linus Torvalds 
Signed-off-by: Greg Kroah-Hartman

fs: clear writeback errors in inode_init_always

2018-07-08T13:30:53+00:00

[ Upstream commit 829bc787c1a0403e4d886296dd4d90c5f9c1744a ]

In inode_init_always(), we clear the inode mapping flags, which clears
any retained error (AS_EIO, AS_ENOSPC) bits.  Unfortunately, we do not
also clear wb_err, which means that old mapping errors can leak through
to new inodes.

This is crucial for the XFS inode allocation path because we recycle old
in-core inodes and we do not want error state from an old file to leak
into the new file.  This bug was discovered by running generic/036 and
generic/047 in a loop and noticing that the EIOs generated by the
collision of direct and buffered writes in generic/036 would survive the
remount between 036 and 047, and get reported to the fsyncs (on
different files!) in generic/047.

Signed-off-by: Darrick J. Wong 
Reviewed-by: Jeff Layton 
Reviewed-by: Brian Foster 
Signed-off-by: Sasha Levin 
Signed-off-by: Greg Kroah-Hartman

Merge branch 'overlayfs-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs

2017-09-13T16:11:44+00:00

Pull overlayfs updates from Miklos Szeredi:
 "This fixes d_ino correctness in readdir, which brings overlayfs on par
  with normal filesystems regarding inode number semantics, as long as
  all layers are on the same filesystem.

  There are also some bug fixes, one in particular (random ioctl's
  shouldn't be able to modify lower layers) that touches some vfs code,
  but of course no-op for non-overlay fs"

* 'overlayfs-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs:
  ovl: fix false positive ESTALE on lookup
  ovl: don't allow writing ioctl on lower layer
  ovl: fix relatime for directories
  vfs: add flags to d_real()
  ovl: cleanup d_real for negative
  ovl: constant d_ino for non-merge dirs
  ovl: constant d_ino across copy up
  ovl: fix readdir error value
  ovl: check snprintf return

lib/interval_tree: fast overlap detection

2017-09-09T01:26:49+00:00

Allow interval trees to quickly check for overlaps to avoid unnecesary
tree lookups in interval_tree_iter_first().

As of this patch, all interval tree flavors will require using a
'rb_root_cached' such that we can have the leftmost node easily
available.  While most users will make use of this feature, those with
special functions (in addition to the generic insert, delete, search
calls) will avoid using the cached option as they can do funky things
with insertions -- for example, vma_interval_tree_insert_after().

[jglisse@redhat.com: fix deadlock from typo vm_lock_anon_vma()]
  Link: http://lkml.kernel.org/r/20170808225719.20723-1-jglisse@redhat.com
Link: http://lkml.kernel.org/r/20170719014603.19029-12-dave@stgolabs.net
Signed-off-by: Davidlohr Bueso 
Signed-off-by: Jérôme Glisse 
Acked-by: Christian König 
Acked-by: Peter Zijlstra (Intel) 
Acked-by: Doug Ledford 
Acked-by: Michael S. Tsirkin 
Cc: David Airlie 
Cc: Jason Wang 
Cc: Christian Benvenuti 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds

ovl: fix relatime for directories

2017-09-05T10:53:11+00:00

Need to treat non-regular overlayfs files the same as regular files when
checking for an atime update.

Add a d_real() flag to make it return the upper dentry for all file types.

Reported-by: "zhangyi (F)" 
Signed-off-by: Miklos Szeredi

xfs: evict all inodes involved with log redo item

2017-09-01T17:55:30+00:00

When we introduced the bmap redo log items, we set MS_ACTIVE on the
mountpoint and XFS_IRECOVERY on the inode to prevent unlinked inodes
from being truncated prematurely during log recovery.  This also had the
effect of putting linked inodes on the lru instead of evicting them.

Unfortunately, we neglected to find all those unreferenced lru inodes
and evict them after finishing log recovery, which means that we leak
them if anything goes wrong in the rest of xfs_mountfs, because the lru
is only cleaned out on unmount.

Therefore, evict unreferenced inodes in the lru list immediately
after clearing MS_ACTIVE.

Fixes: 17c12bcd30 ("xfs: when replaying bmap operations, don't let unlinked inodes get reaped")
Signed-off-by: Darrick J. Wong 
Cc: viro@ZenIV.linux.org.uk
Reviewed-by: Brian Foster