linux.git/fs/ceph/dir.c, branch v5.11

Revert "ceph: allow rename operation under different quota realms"

2020-12-14T22:21:47+00:00

This reverts commit dffdcd71458e699e839f0bf47c3d42d64210b939.

When doing a rename across quota realms, there's a corner case that isn't
handled correctly.  Here's a testcase:

  mkdir files limit
  truncate files/file -s 10G
  setfattr limit -n ceph.quota.max_bytes -v 1000000
  mv files limit/

The above will succeed because ftruncate(2) won't immediately notify the
MDSs with the new file size, and thus the quota realms stats won't be
updated.

Since the possible fixes for this issue would have a huge performance impact,
the solution for now is to simply revert to returning -EXDEV when doing a cross
quota realms rename.

URL: https://tracker.ceph.com/issues/48203
Signed-off-by: Luis Henriques 
Reviewed-by: Jeff Layton 
Reviewed-by: Ilya Dryomov 
Signed-off-by: Ilya Dryomov

ceph: add ceph_sb_to_mdsc helper support to parse the mdsc

2020-10-12T13:29:26+00:00

This will help simplify the code.

[ jlayton: fix minor merge conflict in quota.c ]

Signed-off-by: Xiubo Li 
Signed-off-by: Jeff Layton 
Signed-off-by: Ilya Dryomov

Merge tag 'ceph-for-5.9-rc3' of git://github.com/ceph/ceph-client

2020-08-28T17:33:04+00:00

Pull ceph fixes from Ilya Dryomov:
 "We have an inode number handling change, prompted by s390x which is a
  64-bit architecture with a 32-bit ino_t, a patch to disallow leases to
  avoid potential data integrity issues when CephFS is re-exported via
  NFS or CIFS and a fix for the bulk of W=1 compilation warnings"

* tag 'ceph-for-5.9-rc3' of git://github.com/ceph/ceph-client:
  ceph: don't allow setlease on cephfs
  ceph: fix inode number handling on arches with 32-bit ino_t
  libceph: add __maybe_unused to DEFINE_CEPH_FEATURE

ceph: fix inode number handling on arches with 32-bit ino_t

2020-08-24T15:25:26+00:00

Tuan and Ulrich mentioned that they were hitting a problem on s390x,
which has a 32-bit ino_t value, even though it's a 64-bit arch (for
historical reasons).

I think the current handling of inode numbers in the ceph driver is
wrong. It tries to use 32-bit inode numbers on 32-bit arches, but that's
actually not a problem. 32-bit arches can deal with 64-bit inode numbers
just fine when userland code is compiled with LFS support (the common
case these days).

What we really want to do is just use 64-bit numbers everywhere, unless
someone has mounted with the ino32 mount option. In that case, we want
to ensure that we hash the inode number down to something that will fit
in 32 bits before presenting the value to userland.

Add new helper functions that do this, and only do the conversion before
presenting these values to userland in getattr and readdir.

The inode table hashvalue is changed to just cast the inode number to
unsigned long, as low-order bits are the most likely to vary anyway.

While it's not strictly required, we do want to put something in
inode->i_ino. Instead of basing it on BITS_PER_LONG, however, base it on
the size of the ino_t type.

NOTE: This is a user-visible change on 32-bit arches:

1/ inode numbers will be seen to have changed between kernel versions.
   32-bit arches will see large inode numbers now instead of the hashed
   ones they saw before.

2/ any really old software not built with LFS support may start failing
   stat() calls with -EOVERFLOW on inode numbers >2^32. Nothing much we
   can do about these, but hopefully the intersection of people running
   such code on ceph will be very small.

The workaround for both problems is to mount with "-o ino32".

[ idryomov: changelog tweak ]

URL: https://tracker.ceph.com/issues/46828
Reported-by: Ulrich Weigand 
Reported-and-Tested-by: Tuan Hoang1 
Signed-off-by: Jeff Layton 
Reviewed-by: "Yan, Zheng" 
Signed-off-by: Ilya Dryomov

treewide: Use fallthrough pseudo-keyword

2020-08-23T22:36:59+00:00

Replace the existing /* fall through */ comments and its variants with
the new pseudo-keyword macro fallthrough[1]. Also, remove unnecessary
fall-through markings when it is the case.

[1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through

Signed-off-by: Gustavo A. R. Silva

ceph: set sec_context xattr on symlink creation

2020-08-04T17:41:11+00:00

Symlink inodes should have the security context set in their xattrs on
creation. We already set the context on creation, but we don't attach
the pagelist. The effect is that symlink inodes don't get an SELinux
context set on them at creation, so they end up unlabeled instead of
inheriting the proper context. Make it do so.

Cc: stable@vger.kernel.org
Signed-off-by: Jeff Layton 
Reviewed-by: Ilya Dryomov 
Signed-off-by: Ilya Dryomov

ceph: allow rename operation under different quota realms

2020-06-01T11:22:53+00:00

Returning -EXDEV when trying to 'mv' files/directories from different
quota realms results in copy+unlink operations instead of the faster
CEPH_MDS_OP_RENAME.  This will occur even when there aren't any quotas
set in the destination directory, or if there's enough space left for
the new file(s).

This patch adds a new helper function to be called on rename operations
which will allow these operations if they can be executed.  This patch
mimics userland fuse client commit b8954e5734b3 ("client:
optimize rename operation under different quota root").

Since ceph_quota_is_same_realm() is now called only from this new
helper, make it static.

URL: https://tracker.ceph.com/issues/44791
Signed-off-by: Luis Henriques 
Reviewed-by: Jeff Layton 
Signed-off-by: Ilya Dryomov

ceph: add caps perf metric for each superblock

2020-06-01T11:22:51+00:00

Count hits and misses in the caps cache. If the client has all of
the necessary caps when a task needs references, then it's counted
as a hit. Any other situation is a miss.

URL: https://tracker.ceph.com/issues/43215
Signed-off-by: Xiubo Li 
Reviewed-by: Jeff Layton 
Signed-off-by: Ilya Dryomov

ceph: add dentry lease metric support

2020-06-01T11:22:51+00:00

For dentry leases, only count the hit/miss info triggered from the vfs
calls. For the cases like request reply handling and ceph_trim_dentries,
ignore them.

For now, these are only viewable using debugfs. Future patches will
allow the client to send the stats to the MDS.

The output looks like:

item          total           miss            hit
-------------------------------------------------
d_lease       11              7               141

URL: https://tracker.ceph.com/issues/43215
Signed-off-by: Xiubo Li 
Reviewed-by: Jeff Layton 
Signed-off-by: Ilya Dryomov

ceph: fix potential bad pointer deref in async dirops cb's

2020-04-13T17:33:47+00:00

The new async dirops callback routines can pass ERR_PTR values to
ceph_mdsc_free_path, which could cause an oops. Make ceph_mdsc_free_path
ignore ERR_PTR values. Also, ensure that the pr_warn messages look sane
even if ceph_mdsc_build_path fails.

Reported-by: Dan Carpenter 
Signed-off-by: Jeff Layton 
Reviewed-by: Ilya Dryomov 
Signed-off-by: Ilya Dryomov