linux.git/fs/xfs, branch v5.7

xfs: move inode flush to the sync workqueue

2020-04-16T16:07:42+00:00

Move the inode dirty data flushing to a workqueue so that multiple
threads can take advantage of a single thread's flushing work.  The
ratelimiting technique used in bdd4ee4 was not successful, because
threads that skipped the inode flush scan due to ratelimiting would
fail early with ENOSPC, which caused occasional (but noticeable) changes in
behavior and sporadic fstest regressions.

Therefore, make all the writer threads wait on a single inode flush,
which eliminates both the stampeding hordes of flushers and the small
window in which a write could fail with ENOSPC because it lost the
ratelimit race even after another thread freed space.
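
As a rough illustration of "many writers, one flush", here is a
single-threaded user-space model.  Every name in it is hypothetical and
it is not the kernel's workqueue code; it only shows that queueing an
already-pending flush is a no-op and that one run satisfies every waiter.

```c
#include <stdbool.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
struct flush_work {
	bool pending;      /* a flush has been queued and has not run yet */
	unsigned runs;     /* how many flush scans actually executed */
	unsigned waiters;  /* writers that will be woken by the next run */
};

/* Writer side: ask for a flush.  Returns false if one was already pending. */
static bool flush_queue(struct flush_work *w)
{
	w->waiters++;
	if (w->pending)
		return false;
	w->pending = true;
	return true;
}

/* Worker side: run the single scan and wake everyone.  Returns writers woken. */
static unsigned flush_run(struct flush_work *w)
{
	unsigned woken = w->waiters;

	if (!w->pending)
		return 0;
	w->runs++;
	w->pending = false;
	w->waiters = 0;
	return woken;
}
```

In this model no writer skips the flush, so none of them can give up
with ENOSPC before the scan has had a chance to free space.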

Fixes: c6425702f21e ("xfs: ratelimit inode flush on buffered write ENOSPC")
Signed-off-by: Darrick J. Wong 
Reviewed-by: Brian Foster

xfs: fix partially uninitialized structure in xfs_reflink_remap_extent

2020-04-13T15:00:23+00:00

In the reflink extent remap function, it turns out that uirec (the block
mapping corresponding only to the part of the passed-in mapping that got
unmapped) was not fully initialized.  Specifically, br_state was not
being copied from the passed-in struct to the uirec.  This could lead to
unpredictable results such as the reflinked mapping being marked
unwritten in the destination file.
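
A minimal sketch of the bug class, assuming a simplified record type.
The field names follow the text above, but the types, the enum and the
irec_prefix() helper are made up for illustration.  Copying the whole
struct before adjusting the length is one way to make sure no field is
left behind.

```c
#include <stdint.h>

/* Simplified stand-in for a block mapping record; layout is an assumption. */
enum ext_state { EXT_NORM, EXT_UNWRITTEN };

struct irec {
	uint64_t br_startoff;    /* file offset, in blocks */
	uint64_t br_startblock;  /* starting disk block */
	uint64_t br_blockcount;  /* length, in blocks */
	enum ext_state br_state; /* written or unwritten */
};

/* Build the record covering the first `len` blocks of `src`. */
static struct irec irec_prefix(const struct irec *src, uint64_t len)
{
	struct irec u = *src;  /* copies every field, br_state included */

	if (len < u.br_blockcount)
		u.br_blockcount = len;
	return u;
}
```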

Signed-off-by: Darrick J. Wong 
Reviewed-by: Brian Foster

xfs: acquire superblock freeze protection on eofblocks scans

2020-04-13T15:00:19+00:00

The filesystem freeze sequence in XFS waits on any background
eofblocks or cowblocks scans to complete before the filesystem is
quiesced. By that point, however, the freezer has already stopped the
transaction subsystem, which means a truncate or cowblock
cancellation in progress is likely blocked in transaction
allocation. This results in a deadlock between freeze and the
associated scanner.

Fix this problem by holding superblock write protection across calls
into the block reapers. Since protection for background scans is
acquired from the workqueue task context, trylock to avoid a similar
deadlock between freeze and blocking on the write lock.
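
The trylock pattern can be sketched with a small single-threaded model.
The struct and function names below are hypothetical and say nothing
about how the kernel implements freeze protection; the point is only
that a background scan backs off instead of blocking behind a freeze.

```c
#include <stdbool.h>

/* Hypothetical model of freeze protection; not the kernel's implementation. */
struct sb_model {
	bool frozen;       /* a freeze is in progress or complete */
	unsigned writers;  /* current holders of write protection */
};

static bool sb_trylock_write(struct sb_model *sb)
{
	if (sb->frozen)
		return false;  /* never wait behind a freeze */
	sb->writers++;
	return true;
}

static void sb_unlock_write(struct sb_model *sb)
{
	sb->writers--;
}

/* Background scan: returns true if it ran, false if it backed off. */
static bool background_scan(struct sb_model *sb, unsigned *reaped)
{
	if (!sb_trylock_write(sb))
		return false;
	(*reaped)++;  /* stand-in for trimming post-EOF or CoW blocks */
	sb_unlock_write(sb);
	return true;
}
```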

Fixes: d6b636ebb1c9f ("xfs: halt auto-reclamation activities while rebuilding rmap")
Reported-by: Paul Furtado 
Signed-off-by: Brian Foster 
Reviewed-by: Chandan Rajendra 
Reviewed-by: Christoph Hellwig 
Reviewed-by: Allison Collins 
Reviewed-by: Darrick J. Wong 
Signed-off-by: Darrick J. Wong

xfs: reflink should force the log out if mounted with wsync

2020-04-06T15:44:39+00:00

Reflink should force the log out to disk if the filesystem was mounted
with wsync, the same as most other operations in xfs.

[Note: XFS_MOUNT_WSYNC is set when the admin mounts the filesystem
with either the 'wsync' or 'sync' mount options, which effectively means
that we're classifying reflink/dedupe as IO operations and making them
synchronous when required.]

Fixes: 3fc9f5e409319 ("xfs: remove xfs_reflink_remap_range")
Signed-off-by: Christoph Hellwig 
Reviewed-by: Brian Foster 
[darrick: add more to the changelog]
Reviewed-by: Darrick J. Wong 
Signed-off-by: Darrick J. Wong

xfs: factor out a new xfs_log_force_inode helper

2020-04-06T15:44:35+00:00

Create a new helper to force the log up to the last LSN touching an
inode.
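
A user-space sketch of what such a helper does, under the assumption
that an inode remembers the LSN of its last logged change and that the
log tracks how far it has been written.  The types and names here are
invented for the example and the real structures differ.

```c
#include <stdbool.h>
#include <stdint.h>

typedef int64_t lsn_t;  /* log sequence number in this model */

/* Hypothetical model types; the real structures differ. */
struct log_model   { lsn_t on_disk; unsigned forces; };
struct inode_model { bool pinned; lsn_t last_lsn; };

/* Write the log out up to and including `lsn`, if it is not there yet. */
static void log_force_lsn(struct log_model *log, lsn_t lsn)
{
	if (lsn <= log->on_disk)
		return;
	log->on_disk = lsn;
	log->forces++;
}

/* Force the log far enough to cover the last change to `ip`. */
static void log_force_inode(struct log_model *log, const struct inode_model *ip)
{
	if (!ip->pinned)
		return;  /* nothing of this inode is waiting in the log */
	log_force_lsn(log, ip->last_lsn);
}
```

Callers that need synchronous semantics, such as an operation on a
filesystem mounted with wsync, would call the helper once after their
transaction commits.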

Signed-off-by: Christoph Hellwig 
Reviewed-by: Brian Foster 
Reviewed-by: Darrick J. Wong 
Signed-off-by: Darrick J. Wong

xfs: fix inode number overflow in ifree cluster helper

2020-04-02T15:19:25+00:00

Qian Cai reports seemingly random buffer read verifier errors during
filesystem writeback. This was isolated to a recent patch that
factored out some inode cluster freeing code and happened to cast an
unsigned inode number type to a signed value. If the inode number
value overflows, we can skip marking in-core inodes associated with
the underlying buffer stale at the time the physical inodes are
freed. If such an inode happens to be dirty, xfsaild will eventually
attempt to write it back over non-inode blocks. The invalidation of
the underlying inode buffer causes writeback to read the buffer from
disk. This fails the read verifier (preventing eventual corruption)
if the buffer no longer looks like an inode cluster. Analysis by
Dave Chinner.

Fix up the helper to use the proper type for inode number values.
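
The type problem can be reproduced outside the kernel.  In this sketch
inum_t stands in for a 64-bit unsigned inode number type, and both
functions are hypothetical: one takes the value through an int
parameter, the other through the full-width type.

```c
#include <stdint.h>

typedef uint64_t inum_t;  /* stand-in for a 64-bit unsigned inode number */

/* Buggy shape: the parameter is narrower than the value and signed. */
static inum_t lookup_key_narrow(int inum)
{
	return (inum_t)inum;
}

/* Fixed shape: the parameter carries the full inode number. */
static inum_t lookup_key_wide(inum_t inum)
{
	return inum;
}
```

For any inode number that does not fit in an int, the narrow variant
hands back a different key, so a lookup based on it would miss the
in-core inode it was supposed to find.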

Fixes: 5806165a6663 ("xfs: factor inode lookup from xfs_ifree_cluster")
Reported-by: Qian Cai 
Signed-off-by: Brian Foster 
Reviewed-by: Christoph Hellwig 
Reviewed-by: Darrick J. Wong 
Signed-off-by: Darrick J. Wong

xfs: remove redundant variable assignment in xfs_symlink()

2020-03-31T15:42:22+00:00

The variables 'udqp' and 'gdqp' have already been initialized, so remove
redundant variable assignment in xfs_symlink().

Signed-off-by: Kaixu Xia 
Reviewed-by: Chaitanya Kulkarni 
Reviewed-by: Dave Chinner 
Signed-off-by: Darrick J. Wong

xfs: ratelimit inode flush on buffered write ENOSPC

2020-03-31T15:41:45+00:00

A customer reported rcu stalls and softlockup warnings on a computer
with many CPU cores and many many more IO threads trying to write to a
filesystem that is totally out of space.  Subsequent analysis pointed to
the many many IO threads calling xfs_flush_inodes -> sync_inodes_sb,
which causes a lot of wb_writeback_work to be queued.  The writeback
worker spends so much time trying to wake the many many threads waiting
for writeback completion that it trips the softlockup detector, and (in
this case) the system automatically reboots.

In addition, they complain that the lengthy xfs_flush_inodes scan traps
all of those threads in uninterruptible sleep, which hampers their
ability to kill the program or do anything else to escape the situation.

If there are thousands of threads trying to write to files on a full
filesystem, each of those threads will start separate copies of the
inode flush scan.  This is kind of pointless since we only need one
scan, so rate limit the inode flush.
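
A minimal sketch of interval/burst rate limiting, to show the idea.
The limiter below is hypothetical; its fields, the tick units and the
maybe_flush_inodes() wrapper are assumptions made for this example and
do not describe the kernel's ratelimit code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical interval/burst limiter; parameters are illustrative only. */
struct ratelimit {
	uint64_t interval;  /* window length, in ticks */
	unsigned burst;     /* calls allowed per window */
	uint64_t begin;     /* start of the current window */
	unsigned used;      /* calls already allowed in this window */
};

/* Returns true if the caller may proceed at time `now` (now >= begin). */
static bool ratelimit_ok(struct ratelimit *rl, uint64_t now)
{
	if (now - rl->begin >= rl->interval) {
		rl->begin = now;  /* open a new window */
		rl->used = 0;
	}
	if (rl->used >= rl->burst)
		return false;
	rl->used++;
	return true;
}

/* Writer hitting ENOSPC: only some callers start the expensive scan. */
static bool maybe_flush_inodes(struct ratelimit *rl, uint64_t now,
			       unsigned *scans)
{
	if (!ratelimit_ok(rl, now))
		return false;
	(*scans)++;
	return true;
}
```

Note that callers turned away by the limiter return without having
waited for any scan to finish.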

Signed-off-by: Darrick J. Wong 
Reviewed-by: Dave Chinner

xfs: return locked status of inode buffer on xfsaild push

2020-03-28T16:40:12+00:00

If the inode buffer backing a particular inode is locked,
xfs_iflush() returns -EAGAIN and xfs_inode_item_push() skips the
inode. It still returns success to xfsaild, however, which bypasses
the xfsaild backoff heuristic. Update xfs_inode_item_push() to
return locked status if the inode buffer couldn't be locked.
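
In outline, the change maps one error code to a different push status.
The enum and function names below are hypothetical stand-ins, and the
kernel's own status names and values may differ.

```c
#include <errno.h>

/* Hypothetical status codes; the kernel's own names and values may differ. */
enum push_status { PUSH_SUCCESS, PUSH_LOCKED };

/* Translate the flush result into what the pusher thread should see. */
static enum push_status inode_item_push(int flush_error)
{
	if (flush_error == -EAGAIN)
		return PUSH_LOCKED;  /* buffer was busy: let the caller back off */
	return PUSH_SUCCESS;
}

/* One pass over a batch of flush results; returns how many were stuck. */
static unsigned push_batch(const int *results, unsigned n)
{
	unsigned stuck = 0;

	for (unsigned i = 0; i < n; i++)
		if (inode_item_push(results[i]) == PUSH_LOCKED)
			stuck++;
	return stuck;
}
```

A pusher that sees a nonzero stuck count can use it to decide how long
to sleep before the next pass, which is the heuristic that reporting
success used to bypass.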

Signed-off-by: Brian Foster 
Reviewed-by: Christoph Hellwig 
Reviewed-by: Darrick J. Wong 
Signed-off-by: Darrick J. Wong

xfs: trylock underlying buffer on dquot flush

2020-03-28T16:40:11+00:00

A dquot flush currently blocks on the buffer lock for the underlying
dquot buffer. In turn, this causes xfsaild to block rather than
continue processing other items in the meantime. Update
xfs_qm_dqflush() to trylock the buffer, similar to how inode buffers
are handled, and return -EAGAIN if the lock fails. Fix up any
callers that don't currently handle the error properly.
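
The non-blocking flush can be sketched as follows.  buf_model and the
functions around it are invented for this example; only the -EAGAIN
convention comes from the text above.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical buffer with a non-blocking lock; not a kernel type. */
struct buf_model { bool locked; unsigned writes; };

static bool buf_trylock(struct buf_model *bp)
{
	if (bp->locked)
		return false;
	bp->locked = true;
	return true;
}

static void buf_unlock(struct buf_model *bp)
{
	bp->locked = false;
}

/* Flush one quota record into its buffer without sleeping on the lock. */
static int dquot_flush(struct buf_model *bp)
{
	if (!buf_trylock(bp))
		return -EAGAIN;  /* caller retries later instead of blocking */
	bp->writes++;
	buf_unlock(bp);
	return 0;
}
```

Every caller then has to treat -EAGAIN as "try again later" rather than
as a hard failure.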

Signed-off-by: Brian Foster 
Reviewed-by: Christoph Hellwig 
Reviewed-by: Darrick J. Wong 
Signed-off-by: Darrick J. Wong