linux.git/fs/proc_namespace.c, branch v4.9

vfs: show_vfsstat: do not ignore errors from show_devname method

2016-03-16T17:09:08+00:00

Explicitly check show_devname method return code and bail out in case
of an error.  This fixes regression introduced by commit 9d4d65748a5c.

Cc: stable@vger.kernel.org
Signed-off-by: Dmitry V. Levin 
Signed-off-by: Al Viro

vfs: show_vfsstat: remove redundant initialization and check of error code

2015-12-07T02:17:16+00:00

As err variable is now always checked right after each assignment, its
initialization is redundant and could be safely removed.  For the same
reason, the last check of err is also redundant and could be removed as
well.

Signed-off-by: Dmitry V. Levin 
Signed-off-by: Al Viro

vfs: show_mountinfo: cleanup error code checks

2015-12-07T02:17:15+00:00

Check err variable right after each assignment.  This change makes
initialization of err redundant, so remove the initialization.

Signed-off-by: Dmitry V. Levin 
Signed-off-by: Al Viro

vfs: show_vfsmnt: remove redundant initialization of error code

2015-12-07T02:17:15+00:00

As err variable is now always checked right after the first assignment,
its initialization is redundant and could be safely removed.

Signed-off-by: Dmitry V. Levin 
Signed-off-by: Al Viro

fs: use seq_open_private() for proc_mounts

2015-07-01T02:44:56+00:00

A patchset to remove support for passing pre-allocated struct seq_file to
seq_open().  Such feature is undocumented and prone to error.

In particular, if seq_release() is used in release handler, it will
kfree() a pointer which was not allocated by seq_open().

So this patchset drops support for pre-allocated struct seq_file: it's
only of use in proc_namespace.c and can be easily replaced by using
seq_open_private()/seq_release_private().

Additionally, it documents the use of file->private_data to hold pointer
to struct seq_file by seq_open().

This patch (of 3):

Since patch described below, from v2.6.15-rc1, seq_open() could use a
struct seq_file already allocated by the caller if the pointer to the
structure is stored in file->private_data before calling the function.

    Commit 1abe77b0fc4b485927f1f798ae81a752677e1d05
    Author: Al Viro 
    Date:   Mon Nov 7 17:15:34 2005 -0500

        [PATCH] allow callers of seq_open do allocation themselves

        Allow caller of seq_open() to kmalloc() seq_file + whatever else they
        want and set ->private_data to it.  seq_open() will then abstain from
        doing allocation itself.

Such behavior is only used by mounts_open_common().

In order to drop support for such uncommon feature, proc_mounts is
converted to use seq_open_private(), which take care of allocating the
proc_mounts structure, making it available through ->private in struct
seq_file.

Conversely, proc_mounts is converted to use seq_release_private(), in
order to release the private structure allocated by seq_open_private().

Then, ->private is used directly instead of proc_mounts() macro to access
to the proc_mounts structure.

Link: http://lkml.kernel.org/r/cover.1433193673.git.ydroneaud@opteya.com
Signed-off-by: Yann Droneaud 
Cc: Al Viro 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds

vfs: add support for a lazytime mount option

2015-02-05T07:45:00+00:00

Add a new mount option which enables a new "lazytime" mode.  This mode
causes atime, mtime, and ctime updates to only be made to the
in-memory version of the inode.  The on-disk times will only get
updated when (a) if the inode needs to be updated for some non-time
related change, (b) if userspace calls fsync(), syncfs() or sync(), or
(c) just before an undeleted inode is evicted from memory.

This is OK according to POSIX because there are no guarantees after a
crash unless userspace explicitly requests via a fsync(2) call.

For workloads which feature a large number of random write to a
preallocated file, the lazytime mount option significantly reduces
writes to the inode table.  The repeated 4k writes to a single block
will result in undesirable stress on flash devices and SMR disk
drives.  Even on conventional HDD's, the repeated writes to the inode
table block will trigger Adjacent Track Interference (ATI) remediation
latencies, which very negatively impact long tail latencies --- which
is a very big deal for web serving tiers (for example).

Google-Bug-Id: 18297052

Signed-off-by: Theodore Ts'o 
Signed-off-by: Al Viro

vfs: make mounts and mountstats honor root dir like mountinfo does

2014-12-17T13:27:15+00:00

As we already show mountpoints relative to the root directory, thanks
to the change made back in 2000, change show_vfsmnt() and show_vfsstat()
to skip out-of-root mountpoints the same way as show_mountinfo() does.

Signed-off-by: Dmitry V. Levin 
Signed-off-by: Al Viro

vfs: cleanup show_mountinfo

2014-12-17T13:27:15+00:00

Starting with commit v3.2-rc4-1-g02125a8, seq_path_root() no longer
changes the value of its "struct path *root" argument.
Starting with commit v3.2-rc7-104-g8c9379e, the "struct path *root"
argument of seq_path_root() is const.
As result, the temporary variable "root" in show_mountinfo() that
holds a copy of struct path root is no longer needed.

Signed-off-by: Dmitry V. Levin 
Signed-off-by: Al Viro

namespaces: Use task_lock and not rcu to protect nsproxy

2014-07-30T01:08:50+00:00

The synchronous syncrhonize_rcu in switch_task_namespaces makes setns
a sufficiently expensive system call that people have complained.

Upon inspect nsproxy no longer needs rcu protection for remote reads.
remote reads are rare.  So optimize for same process reads and write
by switching using rask_lock instead.

This yields a simpler to understand lock, and a faster setns system call.

In particular this fixes a performance regression observed
by Rafael David Tinoco .

This is effectively a revert of Pavel Emelyanov's commit
cf7b708c8d1d7a27736771bcf4c457b332b0f818 Make access to task's nsproxy lighter
from 2007.  The race this originialy fixed no longer exists as
do_notify_parent uses task_active_pid_ns(parent) instead of
parent->nsproxy.

Signed-off-by: "Eric W. Biederman"

reduce m_start() cost...

2014-04-02T03:19:09+00:00

Signed-off-by: Al Viro