<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/fs/dcache.c, branch v4.12</title>
<subtitle>Linux kernel source tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/'/>
<entry>
<title>Hang/soft lockup in d_invalidate with simultaneous calls</title>
<updated>2017-06-15T10:52:09+00:00</updated>
<author>
<name>Al Viro</name>
<email>viro@ZenIV.linux.org.uk</email>
</author>
<published>2017-06-03T06:20:09+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=81be24d263dbeddaba35827036d6f6787a59c2c3'/>
<id>81be24d263dbeddaba35827036d6f6787a59c2c3</id>
<content type='text'>
It's not hard to trigger a bunch of d_invalidate() on the same
dentry in parallel.  They end up fighting each other - any
dentry picked for removal by one will be skipped by the rest
and we'll go for the next iteration through the entire
subtree, even if everything is being skipped.  Morevoer, we
immediately go back to scanning the subtree.  The only thing
we really need is to dissolve all mounts in the subtree and
as soon as we've nothing left to do, we can just unhash the
dentry and bugger off.

Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
It's not hard to trigger a bunch of d_invalidate() on the same
dentry in parallel.  They end up fighting each other - any
dentry picked for removal by one will be skipped by the rest
and we'll go for the next iteration through the entire
subtree, even if everything is being skipped.  Morevoer, we
immediately go back to scanning the subtree.  The only thing
we really need is to dissolve all mounts in the subtree and
as soon as we've nothing left to do, we can just unhash the
dentry and bugger off.

Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>fs: don't set *REFERENCED on single use objects</title>
<updated>2017-05-03T15:47:05+00:00</updated>
<author>
<name>Josef Bacik</name>
<email>josef@toxicpanda.com</email>
</author>
<published>2017-04-18T20:04:17+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=563f40019d16e5299e9edd3eac37846760637076'/>
<id>563f40019d16e5299e9edd3eac37846760637076</id>
<content type='text'>
By default we set DCACHE_REFERENCED and I_REFERENCED on any dentry or
inode we create.  This is problematic as this means that it takes two
trips through the LRU for any of these objects to be reclaimed,
regardless of their actual lifetime.  With enough pressure from these
caches we can easily evict our working set from page cache with single
use objects.  So instead only set *REFERENCED if we've already been
added to the LRU list.  This means that we've been touched since the
first time we were accessed, and so more likely to need to hang out in
cache.

To illustrate this issue I wrote the following scripts

https://github.com/josefbacik/debug-scripts/tree/master/cache-pressure

on my test box.  It is a single socket 4 core CPU with 16gib of RAM and
I tested on an Intel 2tib NVME drive.  The cache-pressure.sh script
creates a new file system and creates 2 6.5gib files in order to take up
13gib of the 16gib of ram with pagecache.  Then it runs a test program
that reads these 2 files in a loop, and keeps track of how often it has
to read bytes for each loop.  On an ideal system with no pressure we
should have to read 0 bytes indefinitely.  The second thing this script
does is start a fs_mark job that creates a ton of 0 length files,
putting pressure on the system with slab only allocations.  On exit the
script prints out how many bytes were read by the read-file program.
The results are as follows

Without patch:
/mnt/btrfs-test/reads/file1: total read during loops 27262988288
/mnt/btrfs-test/reads/file2: total read during loops 27262976000

With patch:
/mnt/btrfs-test/reads/file2: total read during loops 18640457728
/mnt/btrfs-test/reads/file1: total read during loops 9565376512

This patch results in a 50% reduction of the amount of pages evicted
from our working set.

Signed-off-by: Josef Bacik &lt;jbacik@fb.com&gt;
Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
By default we set DCACHE_REFERENCED and I_REFERENCED on any dentry or
inode we create.  This is problematic as this means that it takes two
trips through the LRU for any of these objects to be reclaimed,
regardless of their actual lifetime.  With enough pressure from these
caches we can easily evict our working set from page cache with single
use objects.  So instead only set *REFERENCED if we've already been
added to the LRU list.  This means that we've been touched since the
first time we were accessed, and so more likely to need to hang out in
cache.

To illustrate this issue I wrote the following scripts

https://github.com/josefbacik/debug-scripts/tree/master/cache-pressure

on my test box.  It is a single socket 4 core CPU with 16gib of RAM and
I tested on an Intel 2tib NVME drive.  The cache-pressure.sh script
creates a new file system and creates 2 6.5gib files in order to take up
13gib of the 16gib of ram with pagecache.  Then it runs a test program
that reads these 2 files in a loop, and keeps track of how often it has
to read bytes for each loop.  On an ideal system with no pressure we
should have to read 0 bytes indefinitely.  The second thing this script
does is start a fs_mark job that creates a ton of 0 length files,
putting pressure on the system with slab only allocations.  On exit the
script prints out how many bytes were read by the read-file program.
The results are as follows

Without patch:
/mnt/btrfs-test/reads/file1: total read during loops 27262988288
/mnt/btrfs-test/reads/file2: total read during loops 27262976000

With patch:
/mnt/btrfs-test/reads/file2: total read during loops 18640457728
/mnt/btrfs-test/reads/file1: total read during loops 9565376512

This patch results in a 50% reduction of the amount of pages evicted
from our working set.

Signed-off-by: Josef Bacik &lt;jbacik@fb.com&gt;
Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>mnt: Protect the mountpoint hashtable with mount_lock</title>
<updated>2017-01-10T00:34:43+00:00</updated>
<author>
<name>Eric W. Biederman</name>
<email>ebiederm@xmission.com</email>
</author>
<published>2017-01-03T01:18:43+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=3895dbf8985f656675b5bde610723a29cbce3fa7'/>
<id>3895dbf8985f656675b5bde610723a29cbce3fa7</id>
<content type='text'>
Protecting the mountpoint hashtable with namespace_sem was sufficient
until a call to umount_mnt was added to mntput_no_expire.  At which
point it became possible for multiple calls of put_mountpoint on
the same hash chain to happen on the same time.

Kristen Johansen &lt;kjlx@templeofstupid.com&gt; reported:
&gt; This can cause a panic when simultaneous callers of put_mountpoint
&gt; attempt to free the same mountpoint.  This occurs because some callers
&gt; hold the mount_hash_lock, while others hold the namespace lock.  Some
&gt; even hold both.
&gt;
&gt; In this submitter's case, the panic manifested itself as a GP fault in
&gt; put_mountpoint() when it called hlist_del() and attempted to dereference
&gt; a m_hash.pprev that had been poisioned by another thread.

Al Viro observed that the simple fix is to switch from using the namespace_sem
to the mount_lock to protect the mountpoint hash table.

I have taken Al's suggested patch moved put_mountpoint in pivot_root
(instead of taking mount_lock an additional time), and have replaced
new_mountpoint with get_mountpoint a function that does the hash table
lookup and addition under the mount_lock.   The introduction of get_mounptoint
ensures that only the mount_lock is needed to manipulate the mountpoint
hashtable.

d_set_mounted is modified to only set DCACHE_MOUNTED if it is not
already set.  This allows get_mountpoint to use the setting of
DCACHE_MOUNTED to ensure adding a struct mountpoint for a dentry
happens exactly once.

Cc: stable@vger.kernel.org
Fixes: ce07d891a089 ("mnt: Honor MNT_LOCKED when detaching mounts")
Reported-by: Krister Johansen &lt;kjlx@templeofstupid.com&gt;
Suggested-by: Al Viro &lt;viro@ZenIV.linux.org.uk&gt;
Acked-by: Al Viro &lt;viro@ZenIV.linux.org.uk&gt;
Signed-off-by: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Protecting the mountpoint hashtable with namespace_sem was sufficient
until a call to umount_mnt was added to mntput_no_expire.  At which
point it became possible for multiple calls of put_mountpoint on
the same hash chain to happen on the same time.

Kristen Johansen &lt;kjlx@templeofstupid.com&gt; reported:
&gt; This can cause a panic when simultaneous callers of put_mountpoint
&gt; attempt to free the same mountpoint.  This occurs because some callers
&gt; hold the mount_hash_lock, while others hold the namespace lock.  Some
&gt; even hold both.
&gt;
&gt; In this submitter's case, the panic manifested itself as a GP fault in
&gt; put_mountpoint() when it called hlist_del() and attempted to dereference
&gt; a m_hash.pprev that had been poisioned by another thread.

Al Viro observed that the simple fix is to switch from using the namespace_sem
to the mount_lock to protect the mountpoint hash table.

I have taken Al's suggested patch moved put_mountpoint in pivot_root
(instead of taking mount_lock an additional time), and have replaced
new_mountpoint with get_mountpoint a function that does the hash table
lookup and addition under the mount_lock.   The introduction of get_mounptoint
ensures that only the mount_lock is needed to manipulate the mountpoint
hashtable.

d_set_mounted is modified to only set DCACHE_MOUNTED if it is not
already set.  This allows get_mountpoint to use the setting of
DCACHE_MOUNTED to ensure adding a struct mountpoint for a dentry
happens exactly once.

Cc: stable@vger.kernel.org
Fixes: ce07d891a089 ("mnt: Honor MNT_LOCKED when detaching mounts")
Reported-by: Krister Johansen &lt;kjlx@templeofstupid.com&gt;
Suggested-by: Al Viro &lt;viro@ZenIV.linux.org.uk&gt;
Acked-by: Al Viro &lt;viro@ZenIV.linux.org.uk&gt;
Signed-off-by: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Replace &lt;asm/uaccess.h&gt; with &lt;linux/uaccess.h&gt; globally</title>
<updated>2016-12-24T19:46:01+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2016-12-24T19:46:01+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=7c0f6ba682b9c7632072ffbedf8d328c8f3c42ba'/>
<id>7c0f6ba682b9c7632072ffbedf8d328c8f3c42ba</id>
<content type='text'>
This was entirely automated, using the script by Al:

  PATT='^[[:blank:]]*#[[:blank:]]*include[[:blank:]]*&lt;asm/uaccess.h&gt;'
  sed -i -e "s!$PATT!#include &lt;linux/uaccess.h&gt;!" \
        $(git grep -l "$PATT"|grep -v ^include/linux/uaccess.h)

to do the replacement at the end of the merge window.

Requested-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This was entirely automated, using the script by Al:

  PATT='^[[:blank:]]*#[[:blank:]]*include[[:blank:]]*&lt;asm/uaccess.h&gt;'
  sed -i -e "s!$PATT!#include &lt;linux/uaccess.h&gt;!" \
        $(git grep -l "$PATT"|grep -v ^include/linux/uaccess.h)

to do the replacement at the end of the merge window.

Requested-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>vfs: remove unused have_submounts() function</title>
<updated>2016-12-04T01:51:49+00:00</updated>
<author>
<name>Ian Kent</name>
<email>ikent@redhat.com</email>
</author>
<published>2016-11-23T21:03:42+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=f74e7b33c37e5a7bae33bb73858c2766cb256626'/>
<id>f74e7b33c37e5a7bae33bb73858c2766cb256626</id>
<content type='text'>
Now that path_has_submounts() has been added have_submounts() is no
longer used so remove it.

Link: http://lkml.kernel.org/r/20161011053428.27645.12310.stgit@pluto.themaw.net
Signed-off-by: Ian Kent &lt;raven@themaw.net&gt;
Cc: Al Viro &lt;viro@ZenIV.linux.org.uk&gt;
Cc: Eric W. Biederman &lt;ebiederm@xmission.com&gt;
Cc: Omar Sandoval &lt;osandov@osandov.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Now that path_has_submounts() has been added have_submounts() is no
longer used so remove it.

Link: http://lkml.kernel.org/r/20161011053428.27645.12310.stgit@pluto.themaw.net
Signed-off-by: Ian Kent &lt;raven@themaw.net&gt;
Cc: Al Viro &lt;viro@ZenIV.linux.org.uk&gt;
Cc: Eric W. Biederman &lt;ebiederm@xmission.com&gt;
Cc: Omar Sandoval &lt;osandov@osandov.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>vfs: add path_has_submounts()</title>
<updated>2016-12-04T01:51:47+00:00</updated>
<author>
<name>Ian Kent</name>
<email>ikent@redhat.com</email>
</author>
<published>2016-11-23T21:03:41+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=01619491a5f0766014fe863c5ae480665436e7a2'/>
<id>01619491a5f0766014fe863c5ae480665436e7a2</id>
<content type='text'>
d_mountpoint() can only be used reliably to establish if a dentry is
not mounted in any namespace. It isn't aware of the possibility there
may be multiple mounts using the given dentry, possibly in a different
namespace.

Add function, path_has_submounts(), that checks is a struct path contains
mounts (or is a mountpoint itself) to handle this case.

Link: http://lkml.kernel.org/r/20161011053403.27645.55242.stgit@pluto.themaw.net
Signed-off-by: Ian Kent &lt;raven@themaw.net&gt;
Cc: Al Viro &lt;viro@ZenIV.linux.org.uk&gt;
Cc: Eric W. Biederman &lt;ebiederm@xmission.com&gt;
Cc: Omar Sandoval &lt;osandov@osandov.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
d_mountpoint() can only be used reliably to establish if a dentry is
not mounted in any namespace. It isn't aware of the possibility there
may be multiple mounts using the given dentry, possibly in a different
namespace.

Add function, path_has_submounts(), that checks is a struct path contains
mounts (or is a mountpoint itself) to handle this case.

Link: http://lkml.kernel.org/r/20161011053403.27645.55242.stgit@pluto.themaw.net
Signed-off-by: Ian Kent &lt;raven@themaw.net&gt;
Cc: Al Viro &lt;viro@ZenIV.linux.org.uk&gt;
Cc: Eric W. Biederman &lt;ebiederm@xmission.com&gt;
Cc: Omar Sandoval &lt;osandov@osandov.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge branch 'for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs</title>
<updated>2016-08-07T14:01:14+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2016-08-07T14:01:14+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=fe64f3283fb315e3d8f2b78785a86904a852ca82'/>
<id>fe64f3283fb315e3d8f2b78785a86904a852ca82</id>
<content type='text'>
Pull more vfs updates from Al Viro:
 "Assorted cleanups and fixes.

  In the "trivial API change" department - -&gt;d_compare() losing 'parent'
  argument"

* 'for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  cachefiles: Fix race between inactivating and culling a cache object
  9p: use clone_fid()
  9p: fix braino introduced in "9p: new helper - v9fs_parent_fid()"
  vfs: make dentry_needs_remove_privs() internal
  vfs: remove file_needs_remove_privs()
  vfs: fix deadlock in file_remove_privs() on overlayfs
  get rid of 'parent' argument of -&gt;d_compare()
  cifs, msdos, vfat, hfs+: don't bother with parent in -&gt;d_compare()
  affs -&gt;d_compare(): don't bother with -&gt;d_inode
  fold _d_rehash() and __d_rehash() together
  fold dentry_rcuwalk_invalidate() into its only remaining caller
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull more vfs updates from Al Viro:
 "Assorted cleanups and fixes.

  In the "trivial API change" department - -&gt;d_compare() losing 'parent'
  argument"

* 'for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  cachefiles: Fix race between inactivating and culling a cache object
  9p: use clone_fid()
  9p: fix braino introduced in "9p: new helper - v9fs_parent_fid()"
  vfs: make dentry_needs_remove_privs() internal
  vfs: remove file_needs_remove_privs()
  vfs: fix deadlock in file_remove_privs() on overlayfs
  get rid of 'parent' argument of -&gt;d_compare()
  cifs, msdos, vfat, hfs+: don't bother with parent in -&gt;d_compare()
  affs -&gt;d_compare(): don't bother with -&gt;d_inode
  fold _d_rehash() and __d_rehash() together
  fold dentry_rcuwalk_invalidate() into its only remaining caller
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge branch 'work.const-qstr' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs</title>
<updated>2016-08-06T13:49:02+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2016-08-06T13:49:02+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=835c92d43b29eb354abdbd5475308a474d7efdfa'/>
<id>835c92d43b29eb354abdbd5475308a474d7efdfa</id>
<content type='text'>
Pull qstr constification updates from Al Viro:
 "Fairly self-contained bunch - surprising lot of places passes struct
  qstr * as an argument when const struct qstr * would suffice; it
  complicates analysis for no good reason.

  I'd prefer to feed that separately from the assorted fixes (those are
  in #for-linus and with somewhat trickier topology)"

* 'work.const-qstr' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  qstr: constify instances in adfs
  qstr: constify instances in lustre
  qstr: constify instances in f2fs
  qstr: constify instances in ext2
  qstr: constify instances in vfat
  qstr: constify instances in procfs
  qstr: constify instances in fuse
  qstr constify instances in fs/dcache.c
  qstr: constify instances in nfs
  qstr: constify instances in ocfs2
  qstr: constify instances in autofs4
  qstr: constify instances in hfs
  qstr: constify instances in hfsplus
  qstr: constify instances in logfs
  qstr: constify dentry_init_security
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull qstr constification updates from Al Viro:
 "Fairly self-contained bunch - surprising lot of places passes struct
  qstr * as an argument when const struct qstr * would suffice; it
  complicates analysis for no good reason.

  I'd prefer to feed that separately from the assorted fixes (those are
  in #for-linus and with somewhat trickier topology)"

* 'work.const-qstr' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  qstr: constify instances in adfs
  qstr: constify instances in lustre
  qstr: constify instances in f2fs
  qstr: constify instances in ext2
  qstr: constify instances in vfat
  qstr: constify instances in procfs
  qstr: constify instances in fuse
  qstr constify instances in fs/dcache.c
  qstr: constify instances in nfs
  qstr: constify instances in ocfs2
  qstr: constify instances in autofs4
  qstr: constify instances in hfs
  qstr: constify instances in hfsplus
  qstr: constify instances in logfs
  qstr: constify dentry_init_security
</pre>
</div>
</content>
</entry>
<entry>
<title>get rid of 'parent' argument of -&gt;d_compare()</title>
<updated>2016-07-31T20:37:25+00:00</updated>
<author>
<name>Al Viro</name>
<email>viro@zeniv.linux.org.uk</email>
</author>
<published>2016-07-31T20:37:25+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=6fa67e707559303e086303aeecc9e8b91ef497d5'/>
<id>6fa67e707559303e086303aeecc9e8b91ef497d5</id>
<content type='text'>
Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>fold _d_rehash() and __d_rehash() together</title>
<updated>2016-07-29T21:45:21+00:00</updated>
<author>
<name>Al Viro</name>
<email>viro@zeniv.linux.org.uk</email>
</author>
<published>2016-07-29T21:45:21+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=15d3c589f6305c50a705572dbb33620c5cba416c'/>
<id>15d3c589f6305c50a705572dbb33620c5cba416c</id>
<content type='text'>
The only place where we feed to __d_rehash() something other than
d_hash(dentry-&gt;d_name.hash) is __d_move(), where we give it d_hash
of another dentry.  Postpone rehashing until we'd switched the
names and we are rid of that exception, along with the need to
keep _d_rehash() and __d_rehash() separate.

Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The only place where we feed to __d_rehash() something other than
d_hash(dentry-&gt;d_name.hash) is __d_move(), where we give it d_hash
of another dentry.  Postpone rehashing until we'd switched the
names and we are rid of that exception, along with the need to
keep _d_rehash() and __d_rehash() separate.

Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</pre>
</div>
</content>
</entry>
</feed>
