<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/fs/dcache.c, branch v3.11</title>
<subtitle>Linux kernel source tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/'/>
<entry>
<title>vfs: make the dentry cache use the lockref infrastructure</title>
<updated>2013-08-29T01:24:59+00:00</updated>
<author>
<name>Waiman Long</name>
<email>Waiman.Long@hp.com</email>
</author>
<published>2013-08-29T01:24:59+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=98474236f72e5a8b89c14cd7c74f0bb77a4b1a99'/>
<id>98474236f72e5a8b89c14cd7c74f0bb77a4b1a99</id>
<content type='text'>
This just replaces the dentry count/lock combination with the lockref
structure that contains both a count and a spinlock, and does the
mechanical conversion to use the lockref infrastructure.

There are no semantic changes here, it's purely syntactic.  The
reference lockref implementation uses the spinlock exactly the same way
that the old dcache code did, and the bulk of this patch is just
expanding the internal "d_count" use in the dcache code to use
"d_lockref.count" instead.

This is purely preparation for the real change to make the reference
count updates be lockless during the 3.12 merge window.

[ As with the previous commit, this is a rewritten version of a concept
  originally from Waiman, so credit goes to him, blame for any errors
  goes to me.

  Waiman's patch had some semantic differences for taking advantage of
  the lockless update in dget_parent(), while this patch is
  intentionally a pure search-and-replace change with no semantic
  changes.     - Linus ]

Signed-off-by: Waiman Long &lt;Waiman.Long@hp.com&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This just replaces the dentry count/lock combination with the lockref
structure that contains both a count and a spinlock, and does the
mechanical conversion to use the lockref infrastructure.

There are no semantic changes here, it's purely syntactic.  The
reference lockref implementation uses the spinlock exactly the same way
that the old dcache code did, and the bulk of this patch is just
expanding the internal "d_count" use in the dcache code to use
"d_lockref.count" instead.

This is purely preparation for the real change to make the reference
count updates be lockless during the 3.12 merge window.

[ As with the previous commit, this is a rewritten version of a concept
  originally from Waiman, so credit goes to him, blame for any errors
  goes to me.

  Waiman's patch had some semantic differences for taking advantage of
  the lockless update in dget_parent(), while this patch is
  intentionally a pure search-and-replace change with no semantic
  changes.     - Linus ]

Signed-off-by: Waiman Long &lt;Waiman.Long@hp.com&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>cope with potentially long -&gt;d_dname() output for shmem/hugetlb</title>
<updated>2013-08-24T16:10:17+00:00</updated>
<author>
<name>Al Viro</name>
<email>viro@zeniv.linux.org.uk</email>
</author>
<published>2013-08-24T16:08:17+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=118b23022512eb2f41ce42db70dc0568d00be4ba'/>
<id>118b23022512eb2f41ce42db70dc0568d00be4ba</id>
<content type='text'>
dynamic_dname() is both too much and too little for those - the
output may be well in excess of 64 bytes dynamic_dname() assumes
to be enough (thanks to ashmem feeding really long names to
shmem_file_setup()) and vsnprintf() is an overkill for those
guys.

Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
dynamic_dname() is both too much and too little for those - the
output may be well in excess of 64 bytes dynamic_dname() assumes
to be enough (thanks to ashmem feeding really long names to
shmem_file_setup()) and vsnprintf() is an overkill for those
guys.

Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs</title>
<updated>2013-07-03T16:10:19+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2013-07-03T16:10:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=790eac5640abf7a57fa3a644386df330e18c11b0'/>
<id>790eac5640abf7a57fa3a644386df330e18c11b0</id>
<content type='text'>
Pull second set of VFS changes from Al Viro:
 "Assorted f_pos race fixes, making do_splice_direct() safe to call with
  i_mutex on parent, O_TMPFILE support, Jeff's locks.c series,
  -&gt;d_hash/-&gt;d_compare calling conventions changes from Linus, misc
  stuff all over the place."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (63 commits)
  Document -&gt;tmpfile()
  ext4: -&gt;tmpfile() support
  vfs: export lseek_execute() to modules
  lseek_execute() doesn't need an inode passed to it
  block_dev: switch to fixed_size_llseek()
  cpqphp_sysfs: switch to fixed_size_llseek()
  tile-srom: switch to fixed_size_llseek()
  proc_powerpc: switch to fixed_size_llseek()
  ubi/cdev: switch to fixed_size_llseek()
  pci/proc: switch to fixed_size_llseek()
  isapnp: switch to fixed_size_llseek()
  lpfc: switch to fixed_size_llseek()
  locks: give the blocked_hash its own spinlock
  locks: add a new "lm_owner_key" lock operation
  locks: turn the blocked_list into a hashtable
  locks: convert fl_link to a hlist_node
  locks: avoid taking global lock if possible when waking up blocked waiters
  locks: protect most of the file_lock handling with i_lock
  locks: encapsulate the fl_link list handling
  locks: make "added" in __posix_lock_file a bool
  ...
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull second set of VFS changes from Al Viro:
 "Assorted f_pos race fixes, making do_splice_direct() safe to call with
  i_mutex on parent, O_TMPFILE support, Jeff's locks.c series,
  -&gt;d_hash/-&gt;d_compare calling conventions changes from Linus, misc
  stuff all over the place."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (63 commits)
  Document -&gt;tmpfile()
  ext4: -&gt;tmpfile() support
  vfs: export lseek_execute() to modules
  lseek_execute() doesn't need an inode passed to it
  block_dev: switch to fixed_size_llseek()
  cpqphp_sysfs: switch to fixed_size_llseek()
  tile-srom: switch to fixed_size_llseek()
  proc_powerpc: switch to fixed_size_llseek()
  ubi/cdev: switch to fixed_size_llseek()
  pci/proc: switch to fixed_size_llseek()
  isapnp: switch to fixed_size_llseek()
  lpfc: switch to fixed_size_llseek()
  locks: give the blocked_hash its own spinlock
  locks: add a new "lm_owner_key" lock operation
  locks: turn the blocked_list into a hashtable
  locks: convert fl_link to a hlist_node
  locks: avoid taking global lock if possible when waking up blocked waiters
  locks: protect most of the file_lock handling with i_lock
  locks: encapsulate the fl_link list handling
  locks: make "added" in __posix_lock_file a bool
  ...
</pre>
</div>
</content>
</entry>
<entry>
<title>Don't pass inode to -&gt;d_hash() and -&gt;d_compare()</title>
<updated>2013-06-29T08:57:36+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2013-05-21T22:22:44+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=da53be12bbb4fabbe2e9f6f908de0cf478b5161d'/>
<id>da53be12bbb4fabbe2e9f6f908de0cf478b5161d</id>
<content type='text'>
Instances either don't look at it at all (the majority of cases) or
only want it to find the superblock (which can be had as dentry-&gt;d_sb).
A few cases that want more are actually safe with dentry-&gt;d_inode -
the only precaution needed is the check that it hadn't been replaced with
NULL by rmdir() or by overwriting rename(), which case should be simply
treated as cache miss.

Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Instances either don't look at it at all (the majority of cases) or
only want it to find the superblock (which can be had as dentry-&gt;d_sb).
A few cases that want more are actually safe with dentry-&gt;d_inode -
the only precaution needed is the check that it hadn't been replaced with
NULL by rmdir() or by overwriting rename(), which case should be simply
treated as cache miss.

Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>kill find_inode_number()</title>
<updated>2013-06-29T08:57:20+00:00</updated>
<author>
<name>Al Viro</name>
<email>viro@zeniv.linux.org.uk</email>
</author>
<published>2013-06-15T07:37:47+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=0b3fca1fd1499f0f5a7486d494f96538f2b7e5b9'/>
<id>0b3fca1fd1499f0f5a7486d494f96538f2b7e5b9</id>
<content type='text'>
the only remaining caller (in ncpfs) is guaranteed to return 0 -
we only hit it if we'd just checked that there's no dentry with
such name.

Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
the only remaining caller (in ncpfs) is guaranteed to return 0 -
we only hit it if we'd just checked that there's no dentry with
such name.

Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>[O_TMPFILE] it's still short a few helpers, but infrastructure should be OK now...</title>
<updated>2013-06-29T08:57:10+00:00</updated>
<author>
<name>Al Viro</name>
<email>viro@zeniv.linux.org.uk</email>
</author>
<published>2013-06-07T05:20:27+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=60545d0d4610b02e55f65d141c95b18ccf855b6e'/>
<id>60545d0d4610b02e55f65d141c95b18ccf855b6e</id>
<content type='text'>
Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>GFS2: Add atomic_open support</title>
<updated>2013-06-14T10:17:15+00:00</updated>
<author>
<name>Steven Whitehouse</name>
<email>swhiteho@redhat.com</email>
</author>
<published>2013-06-14T10:17:15+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=6d4ade986f9c8df31e68fd30643997f79cc5a5f8'/>
<id>6d4ade986f9c8df31e68fd30643997f79cc5a5f8</id>
<content type='text'>
I've restricted atomic_open to only operate on regular files, although
I still don't understand why atomic_open should not be possible also for
directories on GFS2. That can always be added in later though, if it
makes sense.

The -&gt;atomic_open function can be passed negative dentries, which
in most cases means either ENOENT (-&gt;lookup) or a call to d_instantiate
(-&gt;create). In the GFS2 case though, we need to actually perform the
look up, since we do not know whether there has been a new inode created
on another node. The look up calls d_splice_alias which then tries to
rehash the dentry - so the solution here is to simply check for that
in d_splice_alias. The same issue is likely to affect any other cluster
filesystem implementing -&gt;atomic_open

Signed-off-by: Steven Whitehouse &lt;swhiteho@redhat.com&gt;
Cc: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: "J. Bruce Fields" &lt;bfields fieldses org&gt;
Cc: Jeff Layton &lt;jlayton@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
I've restricted atomic_open to only operate on regular files, although
I still don't understand why atomic_open should not be possible also for
directories on GFS2. That can always be added in later though, if it
makes sense.

The -&gt;atomic_open function can be passed negative dentries, which
in most cases means either ENOENT (-&gt;lookup) or a call to d_instantiate
(-&gt;create). In the GFS2 case though, we need to actually perform the
look up, since we do not know whether there has been a new inode created
on another node. The look up calls d_splice_alias which then tries to
rehash the dentry - so the solution here is to simply check for that
in d_splice_alias. The same issue is likely to affect any other cluster
filesystem implementing -&gt;atomic_open

Signed-off-by: Steven Whitehouse &lt;swhiteho@redhat.com&gt;
Cc: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: "J. Bruce Fields" &lt;bfields fieldses org&gt;
Cc: Jeff Layton &lt;jlayton@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>vfs: use list_move instead of list_del/list_add</title>
<updated>2013-05-04T19:43:02+00:00</updated>
<author>
<name>Wei Yongjun</name>
<email>yongjun_wei@trendmicro.com.cn</email>
</author>
<published>2013-03-11T16:10:50+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=9ed53b12a9a60f4d52228335e76cbbdf0c7e37fb'/>
<id>9ed53b12a9a60f4d52228335e76cbbdf0c7e37fb</id>
<content type='text'>
Using list_move() instead of list_del() + list_add().

Signed-off-by: Wei Yongjun &lt;yongjun_wei@trendmicro.com.cn&gt;
Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Using list_move() instead of list_del() + list_add().

Signed-off-by: Wei Yongjun &lt;yongjun_wei@trendmicro.com.cn&gt;
Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>fs: remove dentry_lru_prune()</title>
<updated>2013-05-04T19:04:01+00:00</updated>
<author>
<name>Yan, Zheng</name>
<email>zheng.z.yan@intel.com</email>
</author>
<published>2013-04-15T06:13:21+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=61572bb1f40b9bec0acbb4d7bc0f5b33739f1ab1'/>
<id>61572bb1f40b9bec0acbb4d7bc0f5b33739f1ab1</id>
<content type='text'>
When pruning a dentry, its ancestor dentry can also be pruned. But
the ancestor dentry does not go through dput(), so it does not get
put on the dentry LRU. Hence associating d_prune with removing the
dentry from the LRU is the wrong.

The fix is remove dentry_lru_prune(). Call file system's d_prune()
callback directly when pruning dentries.

Signed-off-by: Yan, Zheng &lt;zheng.z.yan@intel.com&gt;
Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When pruning a dentry, its ancestor dentry can also be pruned. But
the ancestor dentry does not go through dput(), so it does not get
put on the dentry LRU. Hence associating d_prune with removing the
dentry from the LRU is the wrong.

The fix is remove dentry_lru_prune(). Call file system's d_prune()
callback directly when pruning dentries.

Signed-off-by: Yan, Zheng &lt;zheng.z.yan@intel.com&gt;
Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>fs/dcache.c: add cond_resched() to shrink_dcache_parent()</title>
<updated>2013-05-01T00:04:00+00:00</updated>
<author>
<name>Greg Thelen</name>
<email>gthelen@google.com</email>
</author>
<published>2013-04-30T22:26:48+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=421348f1ca0bf17769dee0aed4d991845ae0536d'/>
<id>421348f1ca0bf17769dee0aed4d991845ae0536d</id>
<content type='text'>
Call cond_resched() in shrink_dcache_parent() to maintain interactivity.

Before this patch:

	void shrink_dcache_parent(struct dentry * parent)
	{
		while ((found = select_parent(parent, &amp;dispose)) != 0)
			shrink_dentry_list(&amp;dispose);
	}

select_parent() populates the dispose list with dentries which
shrink_dentry_list() then deletes.  select_parent() carefully uses
need_resched() to avoid doing too much work at once.  But neither
shrink_dcache_parent() nor its called functions call cond_resched().  So
once need_resched() is set select_parent() will return single dentry
dispose list which is then deleted by shrink_dentry_list().  This is
inefficient when there are a lot of dentry to process.  This can cause
softlockup and hurts interactivity on non preemptable kernels.

This change adds cond_resched() in shrink_dcache_parent().  The benefit
of this is that need_resched() is quickly cleared so that future calls
to select_parent() are able to efficiently return a big batch of dentry.

These additional cond_resched() do not seem to impact performance, at
least for the workload below.

Here is a program which can cause soft lockup if other system activity
sets need_resched().

	int main()
	{
	        struct rlimit rlim;
	        int i;
	        int f[100000];
	        char buf[20];
	        struct timeval t1, t2;
	        double diff;

	        /* cleanup past run */
	        system("rm -rf x");

	        /* boost nfile rlimit */
	        rlim.rlim_cur = 200000;
	        rlim.rlim_max = 200000;
	        if (setrlimit(RLIMIT_NOFILE, &amp;rlim))
	                err(1, "setrlimit");

	        /* make directory for files */
	        if (mkdir("x", 0700))
	                err(1, "mkdir");

	        if (gettimeofday(&amp;t1, NULL))
	                err(1, "gettimeofday");

	        /* populate directory with open files */
	        for (i = 0; i &lt; 100000; i++) {
	                snprintf(buf, sizeof(buf), "x/%d", i);
	                f[i] = open(buf, O_CREAT);
	                if (f[i] == -1)
	                        err(1, "open");
	        }

	        /* close some of the files */
	        for (i = 0; i &lt; 85000; i++)
	                close(f[i]);

	        /* unlink all files, even open ones */
	        system("rm -rf x");

	        if (gettimeofday(&amp;t2, NULL))
	                err(1, "gettimeofday");

	        diff = (((double)t2.tv_sec * 1000000 + t2.tv_usec) -
	                ((double)t1.tv_sec * 1000000 + t1.tv_usec));

	        printf("done: %g elapsed\n", diff/1e6);
	        return 0;
	}

Signed-off-by: Greg Thelen &lt;gthelen@google.com&gt;
Signed-off-by: Dave Chinner &lt;david@fromorbit.com&gt;
Cc: &lt;stable@kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Call cond_resched() in shrink_dcache_parent() to maintain interactivity.

Before this patch:

	void shrink_dcache_parent(struct dentry * parent)
	{
		while ((found = select_parent(parent, &amp;dispose)) != 0)
			shrink_dentry_list(&amp;dispose);
	}

select_parent() populates the dispose list with dentries which
shrink_dentry_list() then deletes.  select_parent() carefully uses
need_resched() to avoid doing too much work at once.  But neither
shrink_dcache_parent() nor its called functions call cond_resched().  So
once need_resched() is set select_parent() will return single dentry
dispose list which is then deleted by shrink_dentry_list().  This is
inefficient when there are a lot of dentry to process.  This can cause
softlockup and hurts interactivity on non preemptable kernels.

This change adds cond_resched() in shrink_dcache_parent().  The benefit
of this is that need_resched() is quickly cleared so that future calls
to select_parent() are able to efficiently return a big batch of dentry.

These additional cond_resched() do not seem to impact performance, at
least for the workload below.

Here is a program which can cause soft lockup if other system activity
sets need_resched().

	int main()
	{
	        struct rlimit rlim;
	        int i;
	        int f[100000];
	        char buf[20];
	        struct timeval t1, t2;
	        double diff;

	        /* cleanup past run */
	        system("rm -rf x");

	        /* boost nfile rlimit */
	        rlim.rlim_cur = 200000;
	        rlim.rlim_max = 200000;
	        if (setrlimit(RLIMIT_NOFILE, &amp;rlim))
	                err(1, "setrlimit");

	        /* make directory for files */
	        if (mkdir("x", 0700))
	                err(1, "mkdir");

	        if (gettimeofday(&amp;t1, NULL))
	                err(1, "gettimeofday");

	        /* populate directory with open files */
	        for (i = 0; i &lt; 100000; i++) {
	                snprintf(buf, sizeof(buf), "x/%d", i);
	                f[i] = open(buf, O_CREAT);
	                if (f[i] == -1)
	                        err(1, "open");
	        }

	        /* close some of the files */
	        for (i = 0; i &lt; 85000; i++)
	                close(f[i]);

	        /* unlink all files, even open ones */
	        system("rm -rf x");

	        if (gettimeofday(&amp;t2, NULL))
	                err(1, "gettimeofday");

	        diff = (((double)t2.tv_sec * 1000000 + t2.tv_usec) -
	                ((double)t1.tv_sec * 1000000 + t1.tv_usec));

	        printf("done: %g elapsed\n", diff/1e6);
	        return 0;
	}

Signed-off-by: Greg Thelen &lt;gthelen@google.com&gt;
Signed-off-by: Dave Chinner &lt;david@fromorbit.com&gt;
Cc: &lt;stable@kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
</feed>
