linux.git/fs/xfs, branch v6.14

Merge tag 'fsnotify_for_v6.14-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs

2025-03-15T18:32:16+00:00

Pull fsnotify reverts from Jan Kara:
 "Syzbot has found out that fsnotify HSM events generated on page fault
  can be generated while we already hold freeze protection for the
  filesystem (when you do buffered write from a buffer which is mmapped
  file on the same filesystem) which violates expectations for HSM
  events and could lead to deadlocks of HSM clients with filesystem
  freezing.

  Since it's quite late in the cycle we've decided to revert changes
  implementing HSM events on page fault for now and instead just
  generate one event for the whole range on mmap(2) so that HSM client
  can fetch the data at that moment"

* tag 'fsnotify_for_v6.14-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
  Revert "fanotify: disable readahead if we have pre-content watches"
  Revert "mm: don't allow huge faults for files with pre content watches"
  Revert "fsnotify: generate pre-content permission event on page fault"
  Revert "xfs: add pre-content fsnotify hook for DAX faults"
  Revert "ext4: add pre-content fsnotify hook for DAX faults"
  fsnotify: add pre-content hooks on mmap()

xfs: Use abs_diff instead of XFS_ABSDIFF

2025-03-14T12:40:17+00:00

We have a central definition for this function since 2023, used by
a number of different parts of the kernel.

Signed-off-by: Matthew Wilcox (Oracle) 
Reviewed-by: Carlos Maiolino 
Reviewed-by: Eric Sandeen 
Signed-off-by: Carlos Maiolino

Revert "xfs: add pre-content fsnotify hook for DAX faults"

2025-03-13T15:30:40+00:00

This reverts commit 7f4796a46571ced5d3d5b0942e1bfea1eedaaecd.

Signed-off-by: Amir Goldstein 
Signed-off-by: Jan Kara 
Link: https://patch.msgid.link/20250312073852.2123409-4-amir73il@gmail.com

xfs: remove the XBF_STALE check from xfs_buf_rele_cached

2025-02-25T12:05:59+00:00

xfs_buf_stale already set b_lru_ref to 0, and thus prevents the buffer
from moving to the LRU.  Remove the duplicate check.

Signed-off-by: Christoph Hellwig 
Reviewed-by: Dave Chinner 
Reviewed-by: Darrick J. Wong 
Signed-off-by: Carlos Maiolino

xfs: remove most in-flight buffer accounting

2025-02-25T12:05:59+00:00

The buffer cache keeps a bt_io_count per-CPU counter to track all
in-flight I/O, which is used to ensure no I/O is in flight when
unmounting the file system.

For most I/O we already keep track of inflight I/O at higher levels:

 - for synchronous I/O (xfs_buf_read/xfs_bwrite/xfs_buf_delwri_submit),
   the caller has a reference and waits for I/O completions using
   xfs_buf_iowait
 - for xfs_buf_delwri_submit_nowait the only caller (AIL writeback)
   tracks the log items that the buffer attached to

This only leaves only xfs_buf_readahead_map as a submitter of
asynchronous I/O that is not tracked by anything else.  Replace the
bt_io_count per-cpu counter with a more specific bt_readahead_count
counter only tracking readahead I/O.  This allows to simply increment
it when submitting readahead I/O and decrementing it when it completed,
and thus simplify xfs_buf_rele and remove the needed for the
XBF_NO_IOACCT flags and the XFS_BSTATE_IN_FLIGHT buffer state.

Signed-off-by: Christoph Hellwig 
Reviewed-by: Dave Chinner 
Reviewed-by: Darrick J. Wong 
Signed-off-by: Carlos Maiolino

xfs: decouple buffer readahead from the normal buffer read path

2025-02-25T12:05:59+00:00

xfs_buf_readahead_map is the only caller of xfs_buf_read_map and thus
_xfs_buf_read that is not synchronous.  Split it from xfs_buf_read_map
so that the asynchronous path is self-contained and the now purely
synchronous xfs_buf_read_map / _xfs_buf_read implementation can be
simplified.

Signed-off-by: Christoph Hellwig 
Reviewed-by: Dave Chinner 
Reviewed-by: Darrick J. Wong 
Signed-off-by: Carlos Maiolino

xfs: reduce context switches for synchronous buffered I/O

2025-02-25T12:05:59+00:00

Currently all metadata I/O completions happen in the m_buf_workqueue
workqueue.  But for synchronous I/O (i.e. all buffer reads) there is no
need for that, as there always is a called in process context that is
waiting for the I/O.  Factor out the guts of xfs_buf_ioend into a
separate helper and call it from xfs_buf_iowait to avoid a double
an extra context switch to the workqueue.

Signed-off-by: Christoph Hellwig 
Reviewed-by: Dave Chinner 
Reviewed-by: Darrick J. Wong 
Signed-off-by: Carlos Maiolino

xfs: flush inodegc before swapon

2025-02-14T08:40:35+00:00

Fix the brand new xfstest that tries to swapon on a recently unshared
file and use the chance to document the other bit of magic in this
function.

The big comment is taken from a mailinglist post by Dave Chinner.

Fixes: 5e672cd69f0a53 ("xfs: introduce xfs_inodegc_push()")
Signed-off-by: Christoph Hellwig 
Reviewed-by: Darrick J. Wong 
Reviewed-by: Dave Chinner 
Signed-off-by: Carlos Maiolino

xfs: rename xfs_iomap_swapfile_activate to xfs_vm_swap_activate

2025-02-14T08:40:35+00:00

Match the method name and the naming convention or address_space
operations.

Signed-off-by: Christoph Hellwig 
Reviewed-by: Darrick J. Wong 
Reviewed-by: Dave Chinner 
Signed-off-by: Carlos Maiolino

xfs: Do not allow norecovery mount with quotacheck

2025-02-14T08:40:35+00:00

Mounting a filesystem that requires quota state changing will generate a
transaction.

We already check for a read-only device; we should do that for
norecovery too.

A quotacheck on a norecovery mount, and with the right log size, will cause
the mount process to hang on:

[<0>] xlog_grant_head_wait+0x5d/0x2a0 [xfs]
[<0>] xlog_grant_head_check+0x112/0x180 [xfs]
[<0>] xfs_log_reserve+0xe3/0x260 [xfs]
[<0>] xfs_trans_reserve+0x179/0x250 [xfs]
[<0>] xfs_trans_alloc+0x101/0x260 [xfs]
[<0>] xfs_sync_sb+0x3f/0x80 [xfs]
[<0>] xfs_qm_mount_quotas+0xe3/0x2f0 [xfs]
[<0>] xfs_mountfs+0x7ad/0xc20 [xfs]
[<0>] xfs_fs_fill_super+0x762/0xa50 [xfs]
[<0>] get_tree_bdev_flags+0x131/0x1d0
[<0>] vfs_get_tree+0x26/0xd0
[<0>] vfs_cmd_create+0x59/0xe0
[<0>] __do_sys_fsconfig+0x4e3/0x6b0
[<0>] do_syscall_64+0x82/0x160
[<0>] entry_SYSCALL_64_after_hwframe+0x76/0x7e

This is caused by a transaction running with bogus initialized head/tail

I initially hit this while running generic/050, with random log
sizes, but I managed to reproduce it reliably here with the steps
below:

mkfs.xfs -f -lsize=1025M -f -b size=4096 -m crc=1,reflink=1,rmapbt=1, -i
sparse=1 /dev/vdb2 > /dev/null
mount -o usrquota,grpquota,prjquota /dev/vdb2 /mnt
xfs_io -x -c 'shutdown -f' /mnt
umount /mnt
mount -o ro,norecovery,usrquota,grpquota,prjquota  /dev/vdb2 /mnt

Last mount hangs up

As we add yet another validation if quota state is changing, this also
add a new helper named xfs_qm_validate_state_change(), factoring the
quota state changes out of xfs_qm_newmount() to reduce cluttering
within it.

Signed-off-by: Carlos Maiolino 
Reviewed-by: Darrick J. Wong 
Signed-off-by: Carlos Maiolino