<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/fs/overlayfs, branch v6.9</title>
<subtitle>Linux kernel source tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/'/>
<entry>
<title>ovl: relax WARN_ON in ovl_verify_area()</title>
<updated>2024-03-17T13:59:41+00:00</updated>
<author>
<name>Amir Goldstein</name>
<email>amir73il@gmail.com</email>
</author>
<published>2024-03-17T11:59:43+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=77a28aa476873048024ad56daf8f4f17d58ee48e'/>
<id>77a28aa476873048024ad56daf8f4f17d58ee48e</id>
<content type='text'>
syzbot hit an assertion in copy up data loop which looks like it is
the result of a lower file whose size is being changed underneath
overlayfs.

This type of use case is documented to cause undefined behavior, so
returning EIO error for the copy up makes sense, but it should not be
causing a WARN_ON assertion.

Reported-and-tested-by: syzbot+3abd99031b42acf367ef@syzkaller.appspotmail.com
Fixes: ca7ab482401c ("ovl: add permission hooks outside of do_splice_direct()")
Signed-off-by: Amir Goldstein &lt;amir73il@gmail.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
syzbot hit an assertion in copy up data loop which looks like it is
the result of a lower file whose size is being changed underneath
overlayfs.

This type of use case is documented to cause undefined behavior, so
returning EIO error for the copy up makes sense, but it should not be
causing a WARN_ON assertion.

Reported-and-tested-by: syzbot+3abd99031b42acf367ef@syzkaller.appspotmail.com
Fixes: ca7ab482401c ("ovl: add permission hooks outside of do_splice_direct()")
Signed-off-by: Amir Goldstein &lt;amir73il@gmail.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>mm, slab: remove last vestiges of SLAB_MEM_SPREAD</title>
<updated>2024-03-13T03:32:19+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2024-03-13T03:32:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=f88c3fb81c4badb46c2fef7d168ff138043e86bb'/>
<id>f88c3fb81c4badb46c2fef7d168ff138043e86bb</id>
<content type='text'>
Yes, yes, I know the slab people were planning on going slow and letting
every subsystem fight this thing on their own.  But let's just rip off
the band-aid and get it over and done with.  I don't want to see a
number of unnecessary pull requests just to get rid of a flag that no
longer has any meaning.

This was mainly done with a couple of 'sed' scripts and then some manual
cleanup of the end result.

Link: https://lore.kernel.org/all/CAHk-=wji0u+OOtmAOD-5JV3SXcRJF___k_+8XNKmak0yd5vW1Q@mail.gmail.com/
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Yes, yes, I know the slab people were planning on going slow and letting
every subsystem fight this thing on their own.  But let's just rip off
the band-aid and get it over and done with.  I don't want to see a
number of unnecessary pull requests just to get rid of a flag that no
longer has any meaning.

This was mainly done with a couple of 'sed' scripts and then some manual
cleanup of the end result.

Link: https://lore.kernel.org/all/CAHk-=wji0u+OOtmAOD-5JV3SXcRJF___k_+8XNKmak0yd5vW1Q@mail.gmail.com/
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge tag 'vfs-6.9.uuid' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs</title>
<updated>2024-03-11T18:02:06+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2024-03-11T18:02:06+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=0f1a876682f0979d6a1e5f86861dd562d1758936'/>
<id>0f1a876682f0979d6a1e5f86861dd562d1758936</id>
<content type='text'>
Pull vfs uuid updates from Christian Brauner:
 "This adds two new ioctl()s for getting the filesystem uuid and
  retrieving the sysfs path based on the path of a mounted filesystem.
  Getting the filesystem uuid has been implemented in filesystem
  specific code for a while it's now lifted as a generic ioctl"

* tag 'vfs-6.9.uuid' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
  xfs: add support for FS_IOC_GETFSSYSFSPATH
  fs: add FS_IOC_GETFSSYSFSPATH
  fat: Hook up sb-&gt;s_uuid
  fs: FS_IOC_GETUUID
  ovl: convert to super_set_uuid()
  fs: super_set_uuid()
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull vfs uuid updates from Christian Brauner:
 "This adds two new ioctl()s for getting the filesystem uuid and
  retrieving the sysfs path based on the path of a mounted filesystem.
  Getting the filesystem uuid has been implemented in filesystem
  specific code for a while it's now lifted as a generic ioctl"

* tag 'vfs-6.9.uuid' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
  xfs: add support for FS_IOC_GETFSSYSFSPATH
  fs: add FS_IOC_GETFSSYSFSPATH
  fat: Hook up sb-&gt;s_uuid
  fs: FS_IOC_GETUUID
  ovl: convert to super_set_uuid()
  fs: super_set_uuid()
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge tag 'vfs-6.9.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs</title>
<updated>2024-03-11T16:38:17+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2024-03-11T16:38:17+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=7ea65c89d864f1e9aa892eec85625a96fa9acee6'/>
<id>7ea65c89d864f1e9aa892eec85625a96fa9acee6</id>
<content type='text'>
Pull misc vfs updates from Christian Brauner:
 "Misc features, cleanups, and fixes for vfs and individual filesystems.

  Features:

   - Support idmapped mounts for hugetlbfs.

   - Add RWF_NOAPPEND flag for pwritev2(). This allows us to fix a bug
     where the passed offset is ignored if the file is O_APPEND. The new
     flag allows a caller to enforce that the offset is honored to
     conform to posix even if the file was opened in append mode.

   - Move i_mmap_rwsem in struct address_space to avoid false sharing
     between i_mmap and i_mmap_rwsem.

   - Convert efs, qnx4, and coda to use the new mount api.

   - Add a generic is_dot_dotdot() helper that's used by various
     filesystems and the VFS code instead of open-coding it multiple
     times.

   - Recently we've added stable offsets which allows stable ordering
     when iterating directories exported through NFS on e.g., tmpfs
     filesystems. Originally an xarray was used for the offset map but
     that caused slab fragmentation issues over time. This switches the
     offset map to the maple tree which has a dense mode that handles
     this scenario a lot better. Includes tests.

   - Finally merge the case-insensitive improvement series Gabriel has
     been working on for a long time. This cleanly propagates case
     insensitive operations through -&gt;s_d_op which in turn allows us to
     remove the quite ugly generic_set_encrypted_ci_d_ops() operations.
     It also improves performance by trying a case-sensitive comparison
     first and then fallback to case-insensitive lookup if that fails.
     This also fixes a bug where overlayfs would be able to be mounted
     over a case insensitive directory which would lead to all sort of
     odd behaviors.

  Cleanups:

   - Make file_dentry() a simple accessor now that -&gt;d_real() is
     simplified because of the backing file work we did the last two
     cycles.

   - Use the dedicated file_mnt_idmap helper in ntfs3.

   - Use smp_load_acquire/store_release() in the i_size_read/write
     helpers and thus remove the hack to handle i_size reads in the
     filemap code.

   - The SLAB_MEM_SPREAD is a nop now. Remove it from various places in
     fs/

   - It's no longer necessary to perform a second built-in initramfs
     unpack call because we retain the contents of the previous
     extraction. Remove it.

   - Now that we have removed various allocators kfree_rcu() always
     works with kmem caches and kmalloc(). So simplify various places
     that only use an rcu callback in order to handle the kmem cache
     case.

   - Convert the pipe code to use a lockdep comparison function instead
     of open-coding the nesting making lockdep validation easier.

   - Move code into fs-writeback.c that was located in a header but can
     be made static as it's only used in that one file.

   - Rewrite the alignment checking iterators for iovec and bvec to be
     easier to read, and also significantly more compact in terms of
     generated code. This saves 270 bytes of text on x86-64 (with
     clang-18) and 224 bytes on arm64 (with gcc-13). In profiles it also
     saves a bit of time for the same workload.

   - Switch various places to use KMEM_CACHE instead of
     kmem_cache_create().

   - Use inode_set_ctime_to_ts() in inode_set_ctime_current()

   - Use kzalloc() in name_to_handle_at() to avoid kernel infoleak.

   - Various smaller cleanups for eventfds.

  Fixes:

   - Fix various comments and typos, and unneeded initializations.

   - Fix stack allocation hack for clang in the select code.

   - Improve dump_mapping() debug code on a best-effort basis.

   - Fix build errors in various selftests.

   - Avoid wrap-around instrumentation in various places.

   - Don't allow user namespaces without an idmapping to be used for
     idmapped mounts.

   - Fix sysv sb_read() call.

   - Fix fallback implementation of the get_name() export operation"

* tag 'vfs-6.9.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (70 commits)
  hugetlbfs: support idmapped mounts
  qnx4: convert qnx4 to use the new mount api
  fs: use inode_set_ctime_to_ts to set inode ctime to current time
  libfs: Drop generic_set_encrypted_ci_d_ops
  ubifs: Configure dentry operations at dentry-creation time
  f2fs: Configure dentry operations at dentry-creation time
  ext4: Configure dentry operations at dentry-creation time
  libfs: Add helper to choose dentry operations at mount-time
  libfs: Merge encrypted_ci_dentry_ops and ci_dentry_ops
  fscrypt: Drop d_revalidate once the key is added
  fscrypt: Drop d_revalidate for valid dentries during lookup
  fscrypt: Factor out a helper to configure the lookup dentry
  ovl: Always reject mounting over case-insensitive directories
  libfs: Attempt exact-match comparison first during casefolded lookup
  efs: remove SLAB_MEM_SPREAD flag usage
  jfs: remove SLAB_MEM_SPREAD flag usage
  minix: remove SLAB_MEM_SPREAD flag usage
  openpromfs: remove SLAB_MEM_SPREAD flag usage
  proc: remove SLAB_MEM_SPREAD flag usage
  qnx6: remove SLAB_MEM_SPREAD flag usage
  ...
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull misc vfs updates from Christian Brauner:
 "Misc features, cleanups, and fixes for vfs and individual filesystems.

  Features:

   - Support idmapped mounts for hugetlbfs.

   - Add RWF_NOAPPEND flag for pwritev2(). This allows us to fix a bug
     where the passed offset is ignored if the file is O_APPEND. The new
     flag allows a caller to enforce that the offset is honored to
     conform to posix even if the file was opened in append mode.

   - Move i_mmap_rwsem in struct address_space to avoid false sharing
     between i_mmap and i_mmap_rwsem.

   - Convert efs, qnx4, and coda to use the new mount api.

   - Add a generic is_dot_dotdot() helper that's used by various
     filesystems and the VFS code instead of open-coding it multiple
     times.

   - Recently we've added stable offsets which allows stable ordering
     when iterating directories exported through NFS on e.g., tmpfs
     filesystems. Originally an xarray was used for the offset map but
     that caused slab fragmentation issues over time. This switches the
     offset map to the maple tree which has a dense mode that handles
     this scenario a lot better. Includes tests.

   - Finally merge the case-insensitive improvement series Gabriel has
     been working on for a long time. This cleanly propagates case
     insensitive operations through -&gt;s_d_op which in turn allows us to
     remove the quite ugly generic_set_encrypted_ci_d_ops() operations.
     It also improves performance by trying a case-sensitive comparison
     first and then fallback to case-insensitive lookup if that fails.
     This also fixes a bug where overlayfs would be able to be mounted
     over a case insensitive directory which would lead to all sort of
     odd behaviors.

  Cleanups:

   - Make file_dentry() a simple accessor now that -&gt;d_real() is
     simplified because of the backing file work we did the last two
     cycles.

   - Use the dedicated file_mnt_idmap helper in ntfs3.

   - Use smp_load_acquire/store_release() in the i_size_read/write
     helpers and thus remove the hack to handle i_size reads in the
     filemap code.

   - The SLAB_MEM_SPREAD is a nop now. Remove it from various places in
     fs/

   - It's no longer necessary to perform a second built-in initramfs
     unpack call because we retain the contents of the previous
     extraction. Remove it.

   - Now that we have removed various allocators kfree_rcu() always
     works with kmem caches and kmalloc(). So simplify various places
     that only use an rcu callback in order to handle the kmem cache
     case.

   - Convert the pipe code to use a lockdep comparison function instead
     of open-coding the nesting making lockdep validation easier.

   - Move code into fs-writeback.c that was located in a header but can
     be made static as it's only used in that one file.

   - Rewrite the alignment checking iterators for iovec and bvec to be
     easier to read, and also significantly more compact in terms of
     generated code. This saves 270 bytes of text on x86-64 (with
     clang-18) and 224 bytes on arm64 (with gcc-13). In profiles it also
     saves a bit of time for the same workload.

   - Switch various places to use KMEM_CACHE instead of
     kmem_cache_create().

   - Use inode_set_ctime_to_ts() in inode_set_ctime_current()

   - Use kzalloc() in name_to_handle_at() to avoid kernel infoleak.

   - Various smaller cleanups for eventfds.

  Fixes:

   - Fix various comments and typos, and unneeded initializations.

   - Fix stack allocation hack for clang in the select code.

   - Improve dump_mapping() debug code on a best-effort basis.

   - Fix build errors in various selftests.

   - Avoid wrap-around instrumentation in various places.

   - Don't allow user namespaces without an idmapping to be used for
     idmapped mounts.

   - Fix sysv sb_read() call.

   - Fix fallback implementation of the get_name() export operation"

* tag 'vfs-6.9.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (70 commits)
  hugetlbfs: support idmapped mounts
  qnx4: convert qnx4 to use the new mount api
  fs: use inode_set_ctime_to_ts to set inode ctime to current time
  libfs: Drop generic_set_encrypted_ci_d_ops
  ubifs: Configure dentry operations at dentry-creation time
  f2fs: Configure dentry operations at dentry-creation time
  ext4: Configure dentry operations at dentry-creation time
  libfs: Add helper to choose dentry operations at mount-time
  libfs: Merge encrypted_ci_dentry_ops and ci_dentry_ops
  fscrypt: Drop d_revalidate once the key is added
  fscrypt: Drop d_revalidate for valid dentries during lookup
  fscrypt: Factor out a helper to configure the lookup dentry
  ovl: Always reject mounting over case-insensitive directories
  libfs: Attempt exact-match comparison first during casefolded lookup
  efs: remove SLAB_MEM_SPREAD flag usage
  jfs: remove SLAB_MEM_SPREAD flag usage
  minix: remove SLAB_MEM_SPREAD flag usage
  openpromfs: remove SLAB_MEM_SPREAD flag usage
  proc: remove SLAB_MEM_SPREAD flag usage
  qnx6: remove SLAB_MEM_SPREAD flag usage
  ...
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge tag 'for-next-6.9' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/krisman/unicode into vfs.misc</title>
<updated>2024-03-07T10:55:41+00:00</updated>
<author>
<name>Christian Brauner</name>
<email>brauner@kernel.org</email>
</author>
<published>2024-03-07T10:55:41+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=09406ad8e5105729291a7639160e0cd51c9e0c6c'/>
<id>09406ad8e5105729291a7639160e0cd51c9e0c6c</id>
<content type='text'>
Merge case-insensitive updates from Gabriel Krisman Bertazi:

- Patch case-insensitive lookup by trying the case-exact comparison
  first, before falling back to costly utf8 casefolded comparison.

- Fix to forbid using a case-insensitive directory as part of an
  overlayfs mount.

- Patchset to ensure d_op are set at d_alloc time for fscrypt and
  casefold volumes, ensuring filesystem dentries will all have the
  correct ops, whether they come from a lookup or not.

* tag 'for-next-6.9' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/krisman/unicode:
  libfs: Drop generic_set_encrypted_ci_d_ops
  ubifs: Configure dentry operations at dentry-creation time
  f2fs: Configure dentry operations at dentry-creation time
  ext4: Configure dentry operations at dentry-creation time
  libfs: Add helper to choose dentry operations at mount-time
  libfs: Merge encrypted_ci_dentry_ops and ci_dentry_ops
  fscrypt: Drop d_revalidate once the key is added
  fscrypt: Drop d_revalidate for valid dentries during lookup
  fscrypt: Factor out a helper to configure the lookup dentry
  ovl: Always reject mounting over case-insensitive directories
  libfs: Attempt exact-match comparison first during casefolded lookup

Signed-off-by: Christian Brauner &lt;brauner@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Merge case-insensitive updates from Gabriel Krisman Bertazi:

- Patch case-insensitive lookup by trying the case-exact comparison
  first, before falling back to costly utf8 casefolded comparison.

- Fix to forbid using a case-insensitive directory as part of an
  overlayfs mount.

- Patchset to ensure d_op are set at d_alloc time for fscrypt and
  casefold volumes, ensuring filesystem dentries will all have the
  correct ops, whether they come from a lookup or not.

* tag 'for-next-6.9' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/krisman/unicode:
  libfs: Drop generic_set_encrypted_ci_d_ops
  ubifs: Configure dentry operations at dentry-creation time
  f2fs: Configure dentry operations at dentry-creation time
  ext4: Configure dentry operations at dentry-creation time
  libfs: Add helper to choose dentry operations at mount-time
  libfs: Merge encrypted_ci_dentry_ops and ci_dentry_ops
  fscrypt: Drop d_revalidate once the key is added
  fscrypt: Drop d_revalidate for valid dentries during lookup
  fscrypt: Factor out a helper to configure the lookup dentry
  ovl: Always reject mounting over case-insensitive directories
  libfs: Attempt exact-match comparison first during casefolded lookup

Signed-off-by: Christian Brauner &lt;brauner@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ovl: Always reject mounting over case-insensitive directories</title>
<updated>2024-02-27T21:55:34+00:00</updated>
<author>
<name>Gabriel Krisman Bertazi</name>
<email>krisman@suse.de</email>
</author>
<published>2024-02-21T17:14:03+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=2824083db76cb9d4b7910607b367e93b02912865'/>
<id>2824083db76cb9d4b7910607b367e93b02912865</id>
<content type='text'>
overlayfs relies on the filesystem setting DCACHE_OP_HASH or
DCACHE_OP_COMPARE to reject mounting over case-insensitive directories.

Since commit bb9cd9106b22 ("fscrypt: Have filesystems handle their
d_ops"), we set -&gt;d_op through a hook in -&gt;d_lookup, which
means the root dentry won't have them, causing the mount to accidentally
succeed.

In v6.7-rc7, the following sequence will succeed to mount, but any
dentry other than the root dentry will be a "weird" dentry to ovl and
fail with EREMOTE.

  mkfs.ext4 -O casefold lower.img
  mount -O loop lower.img lower
  mount -t overlay -o lowerdir=lower,upperdir=upper,workdir=work ovl /mnt

Mounting on a subdirectory fails, as expected, because DCACHE_OP_HASH
and DCACHE_OP_COMPARE are properly set by -&gt;lookup.

Fix by explicitly rejecting superblocks that allow case-insensitive
dentries. Yes, this will be solved when we move d_op configuration back
to -&gt;s_d_op. Yet, we better have an explicit fix to avoid messing up
again.

While there, re-sort the entries to have more descriptive error messages
first.

Fixes: bb9cd9106b22 ("fscrypt: Have filesystems handle their d_ops")
Acked-by: Amir Goldstein &lt;amir73il@gmail.com&gt;
Reviewed-by: Eric Biggers &lt;ebiggers@google.com&gt;
Link: https://lore.kernel.org/r/20240221171412.10710-2-krisman@suse.de
Signed-off-by: Gabriel Krisman Bertazi &lt;krisman@suse.de&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
overlayfs relies on the filesystem setting DCACHE_OP_HASH or
DCACHE_OP_COMPARE to reject mounting over case-insensitive directories.

Since commit bb9cd9106b22 ("fscrypt: Have filesystems handle their
d_ops"), we set -&gt;d_op through a hook in -&gt;d_lookup, which
means the root dentry won't have them, causing the mount to accidentally
succeed.

In v6.7-rc7, the following sequence will succeed to mount, but any
dentry other than the root dentry will be a "weird" dentry to ovl and
fail with EREMOTE.

  mkfs.ext4 -O casefold lower.img
  mount -O loop lower.img lower
  mount -t overlay -o lowerdir=lower,upperdir=upper,workdir=work ovl /mnt

Mounting on a subdirectory fails, as expected, because DCACHE_OP_HASH
and DCACHE_OP_COMPARE are properly set by -&gt;lookup.

Fix by explicitly rejecting superblocks that allow case-insensitive
dentries. Yes, this will be solved when we move d_op configuration back
to -&gt;s_d_op. Yet, we better have an explicit fix to avoid messing up
again.

While there, re-sort the entries to have more descriptive error messages
first.

Fixes: bb9cd9106b22 ("fscrypt: Have filesystems handle their d_ops")
Acked-by: Amir Goldstein &lt;amir73il@gmail.com&gt;
Reviewed-by: Eric Biggers &lt;ebiggers@google.com&gt;
Link: https://lore.kernel.org/r/20240221171412.10710-2-krisman@suse.de
Signed-off-by: Gabriel Krisman Bertazi &lt;krisman@suse.de&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge tag 'vfs-6.8-rc5.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs</title>
<updated>2024-02-12T15:15:45+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2024-02-12T15:15:45+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=716f4aaa7b48a55c73d632d0657b35342b1fefd7'/>
<id>716f4aaa7b48a55c73d632d0657b35342b1fefd7</id>
<content type='text'>
Pull vfs fixes from Christian Brauner:

 - Fix performance regression introduced by moving the security
   permission hook out of do_clone_file_range() and into its caller
   vfs_clone_file_range().

   This causes the security hook to be called in situation were it
   wasn't called before as the fast permission checks were left in
   do_clone_file_range().

   Fix this by merging the two implementations back together and
   restoring the old ordering: fast permission checks first, expensive
   ones later.

 - Tweak mount_setattr() permission checking so that mount properties on
   the real rootfs can be changed.

   When we added mount_setattr() we added additional checks compared to
   legacy mount(2). If the mount had a parent then verify that the
   caller and the mount namespace the mount is attached to match and if
   not make sure that it's an anonymous mount.

   But the real rootfs falls into neither category. It is neither an
   anoymous mount because it is obviously attached to the initial mount
   namespace but it also obviously doesn't have a parent mount. So that
   means legacy mount(2) allows changing mount properties on the real
   rootfs but mount_setattr(2) blocks this. This causes regressions (See
   the commit for details).

   Fix this by relaxing the check. If the mount has a parent or if it
   isn't a detached mount, verify that the mount namespaces of the
   caller and the mount are the same. Technically, we could probably
   write this even simpler and check that the mount namespaces match if
   it isn't a detached mount. But the slightly longer check makes it
   clearer what conditions one needs to think about.

* tag 'vfs-6.8-rc5.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
  fs: relax mount_setattr() permission checks
  remap_range: merge do_clone_file_range() into vfs_clone_file_range()
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull vfs fixes from Christian Brauner:

 - Fix performance regression introduced by moving the security
   permission hook out of do_clone_file_range() and into its caller
   vfs_clone_file_range().

   This causes the security hook to be called in situation were it
   wasn't called before as the fast permission checks were left in
   do_clone_file_range().

   Fix this by merging the two implementations back together and
   restoring the old ordering: fast permission checks first, expensive
   ones later.

 - Tweak mount_setattr() permission checking so that mount properties on
   the real rootfs can be changed.

   When we added mount_setattr() we added additional checks compared to
   legacy mount(2). If the mount had a parent then verify that the
   caller and the mount namespace the mount is attached to match and if
   not make sure that it's an anonymous mount.

   But the real rootfs falls into neither category. It is neither an
   anoymous mount because it is obviously attached to the initial mount
   namespace but it also obviously doesn't have a parent mount. So that
   means legacy mount(2) allows changing mount properties on the real
   rootfs but mount_setattr(2) blocks this. This causes regressions (See
   the commit for details).

   Fix this by relaxing the check. If the mount has a parent or if it
   isn't a detached mount, verify that the mount namespaces of the
   caller and the mount are the same. Technically, we could probably
   write this even simpler and check that the mount namespaces match if
   it isn't a detached mount. But the slightly longer check makes it
   clearer what conditions one needs to think about.

* tag 'vfs-6.8-rc5.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
  fs: relax mount_setattr() permission checks
  remap_range: merge do_clone_file_range() into vfs_clone_file_range()
</pre>
</div>
</content>
</entry>
<entry>
<title>ovl: convert to super_set_uuid()</title>
<updated>2024-02-08T20:20:11+00:00</updated>
<author>
<name>Kent Overstreet</name>
<email>kent.overstreet@linux.dev</email>
</author>
<published>2024-02-07T02:56:16+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=dd9019604cdaaff1327bfd36737230f912e2780d'/>
<id>dd9019604cdaaff1327bfd36737230f912e2780d</id>
<content type='text'>
We don't want to be settingc sb-&gt;s_uuid directly anymore, as there's a
length field that also has to be set, and this conversion was not
completely trivial.

Acked-by: Amir Goldstein &lt;amir73il@gmail.com&gt;
Signed-off-by: Kent Overstreet &lt;kent.overstreet@linux.dev&gt;
Link: https://lore.kernel.org/r/20240207025624.1019754-3-kent.overstreet@linux.dev
Cc: Miklos Szeredi &lt;miklos@szeredi.hu&gt;
Cc: linux-unionfs@vger.kernel.org
Signed-off-by: Christian Brauner &lt;brauner@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
We don't want to be settingc sb-&gt;s_uuid directly anymore, as there's a
length field that also has to be set, and this conversion was not
completely trivial.

Acked-by: Amir Goldstein &lt;amir73il@gmail.com&gt;
Signed-off-by: Kent Overstreet &lt;kent.overstreet@linux.dev&gt;
Link: https://lore.kernel.org/r/20240207025624.1019754-3-kent.overstreet@linux.dev
Cc: Miklos Szeredi &lt;miklos@szeredi.hu&gt;
Cc: linux-unionfs@vger.kernel.org
Signed-off-by: Christian Brauner &lt;brauner@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>remap_range: merge do_clone_file_range() into vfs_clone_file_range()</title>
<updated>2024-02-06T16:07:21+00:00</updated>
<author>
<name>Amir Goldstein</name>
<email>amir73il@gmail.com</email>
</author>
<published>2024-02-02T10:22:58+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=853b8d7597eea4ccaaefbcf0942cd42fc86d542a'/>
<id>853b8d7597eea4ccaaefbcf0942cd42fc86d542a</id>
<content type='text'>
commit dfad37051ade ("remap_range: move permission hooks out of
do_clone_file_range()") moved the permission hooks from
do_clone_file_range() out to its caller vfs_clone_file_range(),
but left all the fast sanity checks in do_clone_file_range().

This makes the expensive security hooks be called in situations
that they would not have been called before (e.g. fs does not support
clone).

The only reason for the do_clone_file_range() helper was that overlayfs
did not use to be able to call vfs_clone_file_range() from copy up
context with sb_writers lock held.  However, since commit c63e56a4a652
("ovl: do not open/llseek lower file with upper sb_writers held"),
overlayfs just uses an open coded version of vfs_clone_file_range().

Merge_clone_file_range() into vfs_clone_file_range(), restoring the
original order of checks as it was before the regressing commit and adapt
the overlayfs code to call vfs_clone_file_range() before the permission
hooks that were added by commit ca7ab482401c ("ovl: add permission hooks
outside of do_splice_direct()").

Note that in the merge of do_clone_file_range(), the file_start_write()
context was reduced to cover -&gt;remap_file_range() without holding it
over the permission hooks, which was the reason for doing the regressing
commit in the first place.

Reported-and-tested-by: kernel test robot &lt;oliver.sang@intel.com&gt;
Closes: https://lore.kernel.org/oe-lkp/202401312229.eddeb9a6-oliver.sang@intel.com
Fixes: dfad37051ade ("remap_range: move permission hooks out of do_clone_file_range()")
Signed-off-by: Amir Goldstein &lt;amir73il@gmail.com&gt;
Link: https://lore.kernel.org/r/20240202102258.1582671-1-amir73il@gmail.com
Reviewed-by: Jan Kara &lt;jack@suse.cz&gt;
Signed-off-by: Christian Brauner &lt;brauner@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit dfad37051ade ("remap_range: move permission hooks out of
do_clone_file_range()") moved the permission hooks from
do_clone_file_range() out to its caller vfs_clone_file_range(),
but left all the fast sanity checks in do_clone_file_range().

This makes the expensive security hooks be called in situations
that they would not have been called before (e.g. fs does not support
clone).

The only reason for the do_clone_file_range() helper was that overlayfs
did not use to be able to call vfs_clone_file_range() from copy up
context with sb_writers lock held.  However, since commit c63e56a4a652
("ovl: do not open/llseek lower file with upper sb_writers held"),
overlayfs just uses an open coded version of vfs_clone_file_range().

Merge_clone_file_range() into vfs_clone_file_range(), restoring the
original order of checks as it was before the regressing commit and adapt
the overlayfs code to call vfs_clone_file_range() before the permission
hooks that were added by commit ca7ab482401c ("ovl: add permission hooks
outside of do_splice_direct()").

Note that in the merge of do_clone_file_range(), the file_start_write()
context was reduced to cover -&gt;remap_file_range() without holding it
over the permission hooks, which was the reason for doing the regressing
commit in the first place.

Reported-and-tested-by: kernel test robot &lt;oliver.sang@intel.com&gt;
Closes: https://lore.kernel.org/oe-lkp/202401312229.eddeb9a6-oliver.sang@intel.com
Fixes: dfad37051ade ("remap_range: move permission hooks out of do_clone_file_range()")
Signed-off-by: Amir Goldstein &lt;amir73il@gmail.com&gt;
Link: https://lore.kernel.org/r/20240202102258.1582671-1-amir73il@gmail.com
Reviewed-by: Jan Kara &lt;jack@suse.cz&gt;
Signed-off-by: Christian Brauner &lt;brauner@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>fs: remove the inode argument to -&gt;d_real() method</title>
<updated>2024-02-06T16:00:12+00:00</updated>
<author>
<name>Amir Goldstein</name>
<email>amir73il@gmail.com</email>
</author>
<published>2024-02-02T11:01:32+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=11b3f8ae7081607a783d60e8098d46b787f79cad'/>
<id>11b3f8ae7081607a783d60e8098d46b787f79cad</id>
<content type='text'>
The only remaining user of -&gt;d_real() method is d_real_inode(), which
passed NULL inode argument to get the real data dentry.

There are no longer any users that call -&gt;d_real() with a non-NULL
inode argument for getting a detry from a specific underlying layer.

Remove the inode argument of the method and replace it with an integer
'type' argument, to allow callers to request the real metadata dentry
instead of the real data dentry.

All the current users of d_real_inode() (e.g. uprobe) continue to get
the real data inode.  Caller that need to get the real metadata inode
(e.g. IMA/EVM) can use d_inode(d_real(dentry, D_REAL_METADATA)).

Signed-off-by: Amir Goldstein &lt;amir73il@gmail.com&gt;
Link: https://lore.kernel.org/r/20240202110132.1584111-3-amir73il@gmail.com
Tested-by: Stefan Berger &lt;stefanb@linux.ibm.com&gt;
Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Signed-off-by: Christian Brauner &lt;brauner@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The only remaining user of -&gt;d_real() method is d_real_inode(), which
passed NULL inode argument to get the real data dentry.

There are no longer any users that call -&gt;d_real() with a non-NULL
inode argument for getting a detry from a specific underlying layer.

Remove the inode argument of the method and replace it with an integer
'type' argument, to allow callers to request the real metadata dentry
instead of the real data dentry.

All the current users of d_real_inode() (e.g. uprobe) continue to get
the real data inode.  Caller that need to get the real metadata inode
(e.g. IMA/EVM) can use d_inode(d_real(dentry, D_REAL_METADATA)).

Signed-off-by: Amir Goldstein &lt;amir73il@gmail.com&gt;
Link: https://lore.kernel.org/r/20240202110132.1584111-3-amir73il@gmail.com
Tested-by: Stefan Berger &lt;stefanb@linux.ibm.com&gt;
Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Signed-off-by: Christian Brauner &lt;brauner@kernel.org&gt;
</pre>
</div>
</content>
</entry>
</feed>
