linux-stable.git/fs, branch v3.4.110

Failing to send a CLOSE if file is opened WRONLY and server reboots on a 4.x mount

2015-10-22T01:20:08+00:00

commit a41cbe86df3afbc82311a1640e20858c0cd7e065 upstream.

A test case is as the description says:
open(foobar, O_WRONLY);
sleep()  --> reboot the server
close(foobar)

The bug arises because nfs4_reclaim_open_state() in nfs4state.c, a few
lines before it jumps to restart, calls
clear_bit(NFS4CLNT_RECLAIM_NOGRACE, &state->flags).

NFS4CLNT_RECLAIM_NOGRACE is a flag for the client state, not for open
owner states. The value of NFS4CLNT_RECLAIM_NOGRACE is 4, which is also
the value of NFS_O_WRONLY_STATE in nfs4_state->flags. Clearing it
therefore wipes out the write-only open state, and when we go to close
the file, “call_close” doesn’t get set because the state flag is not
set, so CLOSE doesn’t go on the wire.
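
The collision described above can be reproduced in userspace with a
minimal sketch. The helper names (toy_set_bit and friends) and the two
enumerators are hypothetical stand-ins, the helpers are not atomic, and
the value 4 for both bits is taken from the description above, not from
the headers.

```c
#include <assert.h>
#include <limits.h>

/* Both enumerators are bit numbers; the values come from the commit text. */
enum { CLIENT_RECLAIM_NOGRACE = 4 };  /* stands in for NFS4CLNT_RECLAIM_NOGRACE */
enum { STATE_O_WRONLY = 4 };          /* stands in for NFS_O_WRONLY_STATE */

/* Hypothetical, non-atomic userspace stand-ins for the kernel bit helpers. */
static void toy_set_bit(unsigned int nr, unsigned long *word)
{
    assert(nr < sizeof(*word) * CHAR_BIT);
    *word |= 1UL << nr;
}

static void toy_clear_bit(unsigned int nr, unsigned long *word)
{
    assert(nr < sizeof(*word) * CHAR_BIT);
    *word &= ~(1UL << nr);
}

static int toy_test_bit(unsigned int nr, const unsigned long *word)
{
    assert(nr < sizeof(*word) * CHAR_BIT);
    return (*word >> nr) & 1UL;
}

/* Returns 1 if the write-only open state survives, i.e. a CLOSE would be sent. */
static int close_sent_after_buggy_reclaim(void)
{
    unsigned long state_flags = 0;

    toy_set_bit(STATE_O_WRONLY, &state_flags);  /* file opened O_WRONLY */
    /* The bug: a client-level bit number applied to the per-open flags word. */
    toy_clear_bit(CLIENT_RECLAIM_NOGRACE, &state_flags);
    return toy_test_bit(STATE_O_WRONLY, &state_flags);
}

/* Same sequence without the stray clear_bit. */
static int close_sent_without_clear(void)
{
    unsigned long state_flags = 0;

    toy_set_bit(STATE_O_WRONLY, &state_flags);
    return toy_test_bit(STATE_O_WRONLY, &state_flags);
}
```

Because the two bit numbers are equal, the clear in the first function
removes exactly the bit that the close path later tests.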

Signed-off-by: Olga Kornievskaia 
Signed-off-by: Trond Myklebust 
Signed-off-by: Zefan Li

vfs: Test for and handle paths that are unreachable from their mnt_root

2015-10-22T01:20:08+00:00

commit 397d425dc26da728396e66d392d5dcb8dac30c37 upstream.

In rare cases a directory can be renamed out from under a bind mount.
In those cases without special handling it becomes possible to walk up
the directory tree to the root dentry of the filesystem and down
from the root dentry to every other file or directory on the filesystem.

Like division by zero, .. from an unconnected path cannot be given
useful semantics, as there is no predicting at which path component
the code will realize it is unconnected.  We certainly cannot match
the current behavior, as the current behavior is a security hole.

Therefore, when encountering .. while following an unconnected path,
return -ENOENT.

- Add a function path_connected to verify path->dentry is reachable
  from path->mnt.mnt_root.  AKA to validate that rename did not do
  something nasty to the bind mount.

  To avoid races, path_connected must be called after following a path
  component to its next path component.
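
The check described above can be sketched in userspace as a walk up the
parent links. This is a toy model under stated assumptions: struct
toy_dentry and both functions are hypothetical, a filesystem root is
modelled as a node that is its own parent, and mount crossing is not
modelled at all.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical miniature dentry: only the parent link matters here. */
struct toy_dentry {
    struct toy_dentry *parent;  /* a filesystem root points to itself */
};

/*
 * Return 1 if walking up from d reaches mnt_root, 0 if the walk hits
 * the filesystem root first (the path is no longer under the mount).
 */
static int toy_path_connected(const struct toy_dentry *mnt_root,
                              const struct toy_dentry *d)
{
    assert(mnt_root != NULL && d != NULL);
    for (;;) {
        if (d == mnt_root)
            return 1;
        if (d->parent == d)
            return 0;
        d = d->parent;
    }
}

/*
 * Follow "..": step to the parent first, then verify the new position,
 * mirroring the "check after following a component" ordering.
 */
static int toy_follow_dotdot(const struct toy_dentry *mnt_root,
                             const struct toy_dentry **pos)
{
    if (*pos == mnt_root)
        return 0;               /* ".." at the mount root stays put */
    *pos = (*pos)->parent;
    if (!toy_path_connected(mnt_root, *pos))
        return -ENOENT;
    return 0;
}
```

A dentry that was renamed out from under the mount root fails the check
on its first .. step instead of being allowed to climb to the
filesystem root.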

Signed-off-by: "Eric W. Biederman" 
Signed-off-by: Al Viro

dcache: Handle escaped paths in prepend_path

2015-10-22T01:20:08+00:00

commit cde93be45a8a90d8c264c776fab63487b5038a65 upstream.

A rename can result in a dentry that by walking up d_parent
will never reach its mnt_root.  For lack of a better term
I call this an escaped path.

prepend_path is called by four different functions: __d_path,
d_absolute_path, d_path, and getcwd.

__d_path only wants to see paths that are connected to the root it
passes in.  So __d_path needs prepend_path to return an error.

d_absolute_path similarly wants to see paths that are connected to
some root.  Escaped paths are not connected to any mnt_root so
d_absolute_path needs prepend_path to return an error greater
than 1.  So escaped paths will be treated like paths on lazily
unmounted mounts.

getcwd needs to prepend "(unreachable)" so getcwd also needs
prepend_path to return an error.

d_path is the interesting hold out.  d_path just wants to print
something, and does not care about the weird cases.  Which raises
the question what should be printed?

Given that <escaped_path>/<anything> should result in -ENOENT I
believe it is desirable for escaped paths to be printed as empty
paths.  As there are not really any meaningful path components when
considered from the perspective of a mount tree.

So tweak prepend_path to return an empty path with a new error
code of 3 when it encounters an escaped path.
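
The behavior described above can be sketched in userspace as follows.
This is a toy model, not the kernel function: struct toy_dentry,
toy_prepend and toy_prepend_path are hypothetical, a filesystem root is
modelled as a node that is its own parent, and only the return values 0
(connected) and 3 (escaped) from the description are modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct toy_dentry {
    const char *name;
    struct toy_dentry *parent;  /* a filesystem root points to itself */
};

/* Prepend "/name" in front of position *pos, building the path backwards. */
static int toy_prepend(char *buf, size_t *pos, const char *name)
{
    size_t len = strlen(name);

    if (*pos < len + 1)
        return -1;              /* not enough room left */
    *pos -= len;
    memcpy(buf + *pos, name, len);
    buf[--*pos] = '/';
    return 0;
}

/*
 * Returns 0 and a path relative to mnt_root when d is connected,
 * 3 and an empty path when d has escaped, -1 if buf is too small.
 */
static int toy_prepend_path(const struct toy_dentry *mnt_root,
                            const struct toy_dentry *d,
                            char *buf, size_t size, char **out)
{
    size_t pos;

    assert(size > 0);
    pos = size - 1;
    buf[pos] = '\0';
    *out = buf + pos;           /* empty string until proven otherwise */

    while (d != mnt_root) {
        if (d->parent == d)     /* hit the fs root without meeting mnt_root */
            return 3;           /* escaped: leave *out as the empty path */
        if (toy_prepend(buf, &pos, d->name) < 0)
            return -1;
        d = d->parent;
    }
    if (pos == size - 1) {      /* d was mnt_root itself: path is "/" */
        if (pos == 0)
            return -1;
        buf[--pos] = '/';
    }
    *out = buf + pos;
    return 0;
}
```

A caller that only wants something printable can use the returned
string as is, while callers that care can test for the value 3.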

Signed-off-by: "Eric W. Biederman" 
Signed-off-by: Al Viro 
Signed-off-by: Zefan Li

jbd2: avoid infinite loop when destroying aborted journal

2015-10-22T01:20:08+00:00

commit 841df7df196237ea63233f0f9eaa41db53afd70f upstream.

Commit 6f6a6fda2945 "jbd2: fix ocfs2 corrupt when updating journal
superblock fails" changed jbd2_cleanup_journal_tail() to return EIO
when the journal is aborted. That makes the logic in
jbd2_log_do_checkpoint() bail out, which is fine, except that
jbd2_journal_destroy() expects jbd2_log_do_checkpoint() to always make
progress in cleaning the journal. Without it, jbd2_journal_destroy()
loops forever.

Fix jbd2_journal_destroy() to clean up journal checkpoint lists if
jbd2_log_do_checkpoint() fails with an error.
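
The loop shape described above can be sketched in userspace as follows.
This is a toy model: struct toy_journal and all three functions are
hypothetical, and the checkpoint lists are reduced to a simple counter.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct toy_journal {
    int checkpoint_transactions;  /* entries still on the checkpoint list */
    int aborted;
};

/*
 * Models the checkpoint step: on an aborted journal it bails out with
 * -EIO and makes no progress, otherwise it retires one entry.
 */
static int toy_do_checkpoint(struct toy_journal *j)
{
    if (j->aborted)
        return -EIO;
    if (j->checkpoint_transactions > 0)
        j->checkpoint_transactions--;
    return 0;
}

/* Hypothetical cleanup: drop every remaining checkpoint entry. */
static void toy_destroy_checkpoint(struct toy_journal *j)
{
    j->checkpoint_transactions = 0;
}

/* Returns the number of loop iterations that were needed. */
static int toy_journal_destroy(struct toy_journal *j)
{
    int iterations = 0;

    assert(j != NULL);
    while (j->checkpoint_transactions > 0) {
        iterations++;
        if (toy_do_checkpoint(j) < 0) {
            /* Without this branch an aborted journal would spin here. */
            toy_destroy_checkpoint(j);
            break;
        }
    }
    return iterations;
}
```

With a healthy journal the loop drains the list one entry per pass; with
an aborted journal it exits after the first failed pass.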

Reported-by: Eryu Guan 
Tested-by: Eryu Guan 
Fixes: 6f6a6fda294506dfe0e3e0a253bb2d2923f28f0a
Signed-off-by: Jan Kara 
Signed-off-by: Theodore Ts'o 
[lizf: Backported to 3.4: adjust context]
Signed-off-by: Zefan Li

fuse: initialize fc->release before calling it

2015-10-22T01:20:07+00:00

commit 0ad0b3255a08020eaf50e34ef0d6df5bdf5e09ed upstream.

fc->release is called from fuse_conn_put() which was used in the error
cleanup before fc->release was initialized.

[Jeremiah Mahler: assign fc->release after calling
fuse_conn_init(fc) instead of before.]
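
The ordering issue described above can be sketched in userspace as
follows. This is a toy model: struct toy_conn and every function here
are hypothetical, and the assumption that the init function resets the
release pointer is made only to illustrate why the assignment has to
come after it.

```c
#include <assert.h>
#include <stddef.h>

struct toy_conn {
    int count;                            /* reference count */
    void (*release)(struct toy_conn *);   /* called when count drops to 0 */
};

static int released;  /* counts how often the release callback ran */

static void toy_free_conn(struct toy_conn *c)
{
    (void)c;
    released++;
}

/* Assumption for this sketch: init resets every field, release included. */
static void toy_conn_init(struct toy_conn *c)
{
    c->count = 1;
    c->release = NULL;
}

static void toy_conn_put(struct toy_conn *c)
{
    if (--c->count == 0) {
        assert(c->release != NULL);  /* would be a NULL call otherwise */
        c->release(c);
    }
}

/* Mount-like setup: returns 0 on success, -1 on an early failure. */
static int toy_fill_super(struct toy_conn *c, int fail_early)
{
    toy_conn_init(c);
    c->release = toy_free_conn;  /* after init, before any error path */

    if (fail_early) {
        toy_conn_put(c);         /* error cleanup may now call release */
        return -1;
    }
    return 0;
}
```

Assigning the callback before toy_conn_init() would be undone by the
reset, and assigning it after the first error path would leave that path
calling a NULL pointer.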

Signed-off-by: Miklos Szeredi 
Fixes: a325f9b92273 ("fuse: update fuse_conn_init() and separate out fuse_conn_kill()")
Signed-off-by: Zefan Li

ext4: don't retry file block mapping on bigalloc fs with non-extent file

2015-10-22T01:20:05+00:00

commit 292db1bc6c105d86111e858859456bcb11f90f91 upstream.

ext4 isn't willing to map clusters to a non-extent file.  Don't signal
this with an out of space error, since the FS will retry the
allocation (which didn't fail) forever.  Instead, return EUCLEAN so
that the operation will fail immediately all the way back to userspace.

(The fix is either to run e2fsck -E bmap2extent, or to chattr +e the file.)
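
The retry behavior described above can be sketched in userspace as
follows. This is a toy model: both functions are hypothetical, and the
retry loop is bounded by max_retries only so that the sketch terminates,
whereas the description above says the real retry goes on forever.

```c
#include <assert.h>
#include <errno.h>

#ifndef EUCLEAN
#define EUCLEAN 117  /* Linux value; defined here only if the libc lacks it */
#endif

/*
 * Hypothetical mapper: it always refuses to map clusters to the
 * non-extent file and reports the refusal with the given errno.
 */
static int toy_map_blocks(int refusal_errno)
{
    return -refusal_errno;
}

/*
 * Hypothetical caller: it treats -ENOSPC as "try again" and any other
 * error as final. The number of attempts is reported through *attempts.
 */
static int toy_write_with_retry(int refusal_errno, int max_retries,
                                int *attempts)
{
    int err;

    assert(max_retries > 0);
    *attempts = 0;
    do {
        (*attempts)++;
        err = toy_map_blocks(refusal_errno);
    } while (err == -ENOSPC && *attempts < max_retries);
    return err;
}
```

Signalling the refusal with ENOSPC uses up every allowed retry, while
EUCLEAN ends the operation after a single attempt.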

Signed-off-by: Darrick J. Wong 
Signed-off-by: Theodore Ts'o 
Signed-off-by: Zefan Li

ext4: call sync_blockdev() before invalidate_bdev() in put_super()

2015-10-22T01:20:05+00:00

commit 89d96a6f8e6491f24fc8f99fd6ae66820e85c6c1 upstream.

Normally all of the buffers will have been forced out to disk before
we call invalidate_bdev(), but there will be some cases, where a file
system operation was aborted due to an ext4_error(), in which there
may still be some dirty buffers in the buffer cache for the device.
So try to force them out to disk before calling invalidate_bdev().

This fixes a warning triggered by generic/081:

WARNING: CPU: 1 PID: 3473 at /usr/projects/linux/ext4/fs/block_dev.c:56 __blkdev_put+0xb5/0x16f()

Signed-off-by: Theodore Ts'o 
Signed-off-by: Zefan Li

jbd2: fix ocfs2 corrupt when updating journal superblock fails

2015-10-22T01:20:04+00:00

commit 6f6a6fda294506dfe0e3e0a253bb2d2923f28f0a upstream.

If updating the journal superblock fails after journal data has been
flushed, the error is dropped and the caller is misled into treating
the call as successful.  In ocfs2, the checkpoint will then be treated
as successful and the other node can get the lock to update.  Since
sb_start still points to the old log block, the other node will
rewrite the old journal data during journal recovery.  Thus the new
updates will be overwritten and ocfs2 is corrupted.  So in the above
case we have to return the error; ocfs2_commit_cache will take care of
the error and prevent the other node from updating first.  Only after
the journal has been recovered can it do the new updates.

The issue discussion mail can be found at:
https://oss.oracle.com/pipermail/ocfs2-devel/2015-June/010856.html
http://comments.gmane.org/gmane.comp.file-systems.ext4/48841

[ Fixed bug in patch which allowed a non-negative error return from
  jbd2_cleanup_journal_tail() to leak out of jbd2_journal_flush(); this
  was causing xfstests ext4/306 to fail. -- Ted ]

Reported-by: Yiwen Jiang 
Signed-off-by: Joseph Qi 
Signed-off-by: Theodore Ts'o 
Tested-by: Yiwen Jiang 
Cc: Junxiao Bi 
Signed-off-by: Zefan Li

jbd2: use GFP_NOFS in jbd2_cleanup_journal_tail()

2015-10-22T01:20:04+00:00

commit b4f1afcd068f6e533230dfed00782cd8a907f96b upstream.

jbd2_cleanup_journal_tail() can be invoked by jbd2__journal_start(),
so allocations should be done with GFP_NOFS.

[Full stack trace snipped from 3.10-rh7]
[] dump_stack+0x19/0x1b
[] warn_slowpath_common+0x61/0x80
[] warn_slowpath_null+0x1a/0x20
[] slab_pre_alloc_hook.isra.31.part.32+0x15/0x17
[] kmem_cache_alloc+0x55/0x210
[] ? mempool_alloc_slab+0x15/0x20
[] mempool_alloc_slab+0x15/0x20
[] mempool_alloc+0x69/0x170
[] ? _raw_spin_unlock_irq+0xe/0x20
[] ? finish_task_switch+0x5d/0x150
[] bio_alloc_bioset+0x1be/0x2e0
[] blkdev_issue_flush+0x99/0x120
[] jbd2_cleanup_journal_tail+0x93/0xa0 [jbd2] -->GFP_KERNEL
[] jbd2_log_do_checkpoint+0x221/0x4a0 [jbd2]
[] __jbd2_log_wait_for_space+0xa7/0x1e0 [jbd2]
[] start_this_handle+0x2d8/0x550 [jbd2]
[] ? __memcg_kmem_put_cache+0x29/0x30
[] ? kmem_cache_alloc+0x130/0x210
[] jbd2__journal_start+0xba/0x190 [jbd2]
[] ? lru_cache_add+0xe/0x10
[] ? ext4_da_write_begin+0xf9/0x330 [ext4]
[] __ext4_journal_start_sb+0x77/0x160 [ext4]
[] ext4_da_write_begin+0xf9/0x330 [ext4]
[] generic_file_buffered_write_iter+0x10c/0x270
[] __generic_file_write_iter+0x178/0x390
[] __generic_file_aio_write+0x8b/0xb0
[] generic_file_aio_write+0x5d/0xc0
[] ext4_file_write+0xa9/0x450 [ext4]
[] ? pipe_read+0x379/0x4f0
[] do_sync_write+0x90/0xe0
[] vfs_write+0xbd/0x1e0
[] SyS_write+0x58/0xb0
[] system_call_fastpath+0x16/0x1b

Signed-off-by: Dmitry Monakhov 
Signed-off-by: Theodore Ts'o 
Signed-off-by: Zefan Li

ext4: fix race between truncate and __ext4_journalled_writepage()

2015-10-22T01:20:04+00:00

commit bdf96838aea6a265f2ae6cbcfb12a778c84a0b8e upstream.

The commit cf108bca465d: "ext4: Invert the locking order of page_lock
and transaction start" caused __ext4_journalled_writepage() to drop
the page lock before the page was written back, as part of changing
the locking order to jbd2_journal_start -> page_lock.  However, this
introduced a potential race if there was a truncate racing with the
data=journalled writeback mode.

Fix this by grabbing the page lock after starting the journal handle,
and then checking to see if page had gotten truncated out from under
us.
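
The recheck described above can be sketched in userspace as follows.
This is a single-threaded toy model: the structures and functions are
hypothetical, the page lock is a plain flag, and the racing truncate is
simulated by a parameter that fires while the lock is dropped.

```c
#include <assert.h>
#include <stddef.h>

struct toy_mapping {
    int id;
};

struct toy_page {
    struct toy_mapping *mapping;  /* NULL once the page has been truncated */
    int locked;
    int written;
};

static void toy_lock_page(struct toy_page *p)
{
    assert(!p->locked);
    p->locked = 1;
}

static void toy_unlock_page(struct toy_page *p)
{
    assert(p->locked);
    p->locked = 0;
}

/* Truncate detaches the page from its mapping. */
static void toy_truncate(struct toy_page *p)
{
    p->mapping = NULL;
}

/*
 * The caller holds the page lock on entry. Returns 1 if the page was
 * written, 0 if it was skipped because it had been truncated.
 */
static int toy_journalled_writepage(struct toy_page *p,
                                    struct toy_mapping *m,
                                    int truncate_in_window)
{
    toy_unlock_page(p);      /* dropped so the handle can be started first */
    if (truncate_in_window)
        toy_truncate(p);     /* simulated racing truncate */
    /* ... the journal handle would be started here ... */

    toy_lock_page(p);        /* retake the lock after starting the handle */
    if (p->mapping != m) {   /* page was truncated while unlocked */
        toy_unlock_page(p);
        return 0;
    }
    p->written = 1;
    toy_unlock_page(p);
    return 1;
}
```

Comparing the mapping after retaking the lock is what lets the writer
notice that the page no longer belongs to the file.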

This fixes a number of different warnings or BUG_ON's when running
xfstests generic/086 in data=journalled mode, including:

jbd2_journal_dirty_metadata: vdc-8: bad jh for block 115643: transaction (ee3fe7c0, 164), jh->b_transaction (  (null), 0), jh->b_next_transaction (  (null), 0), jlist 0

	      	      	  - and -

kernel BUG at /usr/projects/linux/ext4/fs/jbd2/transaction.c:2200!
    ...
Call Trace:
 [] ? __ext4_journalled_invalidatepage+0x117/0x117
 [] __ext4_journalled_invalidatepage+0x10f/0x117
 [] ? __ext4_journalled_invalidatepage+0x117/0x117
 [] ? lock_buffer+0x36/0x36
 [] ext4_journalled_invalidatepage+0xd/0x22
 [] do_invalidatepage+0x22/0x26
 [] truncate_inode_page+0x5b/0x85
 [] truncate_inode_pages_range+0x156/0x38c
 [] truncate_inode_pages+0x11/0x15
 [] truncate_pagecache+0x55/0x71
 [] ext4_setattr+0x4a9/0x560
 [] ? current_kernel_time+0x10/0x44
 [] notify_change+0x1c7/0x2be
 [] do_truncate+0x65/0x85
 [] ? file_ra_state_init+0x12/0x29

	      	      	  - and -

WARNING: CPU: 1 PID: 1331 at /usr/projects/linux/ext4/fs/jbd2/transaction.c:1396
irty_metadata+0x14a/0x1ae()
    ...
Call Trace:
 [] ? console_unlock+0x3a1/0x3ce
 [] dump_stack+0x48/0x60
 [] warn_slowpath_common+0x89/0xa0
 [] ? jbd2_journal_dirty_metadata+0x14a/0x1ae
 [] warn_slowpath_null+0x14/0x18
 [] jbd2_journal_dirty_metadata+0x14a/0x1ae
 [] __ext4_handle_dirty_metadata+0xd4/0x19d
 [] write_end_fn+0x40/0x53
 [] ext4_walk_page_buffers+0x4e/0x6a
 [] ext4_writepage+0x354/0x3b8
 [] ? mpage_release_unused_pages+0xd4/0xd4
 [] ? wait_on_buffer+0x2c/0x2c
 [] ? ext4_writepage+0x3b8/0x3b8
 [] __writepage+0x10/0x2e
 [] write_cache_pages+0x22d/0x32c
 [] ? ext4_writepage+0x3b8/0x3b8
 [] ext4_writepages+0x102/0x607
 [] ? sched_clock_local+0x10/0x10e
 [] ? __lock_is_held+0x2e/0x44
 [] ? lock_is_held+0x43/0x51
 [] do_writepages+0x1c/0x29
 [] __writeback_single_inode+0xc3/0x545
 [] writeback_sb_inodes+0x21f/0x36d
    ...

Signed-off-by: Theodore Ts'o 
Signed-off-by: Zefan Li