<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/fs/dcache.c, branch v2.6.16</title>
<subtitle>Linux kernel source tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/'/>
<entry>
<title>[PATCH] fix file counting</title>
<updated>2006-03-08T22:14:01+00:00</updated>
<author>
<name>Dipankar Sarma</name>
<email>dipankar@in.ibm.com</email>
</author>
<published>2006-03-08T05:55:35+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=529bf6be5c04f2e869d07bfdb122e9fd98ade714'/>
<id>529bf6be5c04f2e869d07bfdb122e9fd98ade714</id>
<content type='text'>
I have benchmarked this on an x86_64 NUMA system and see no significant
performance difference on kernbench.  Tested on both x86_64 and powerpc.

The way we do file struct accounting is not very suitable for batched
freeing.  For scalability reasons, file accounting was
constructor/destructor based.  This meant that nr_files was decremented
only when the object was removed from the slab cache.  This is susceptible
to slab fragmentation.  With RCU based file structure, consequent batched
freeing and a test program like Serge's, we just speed this up and end up
with a very fragmented slab -

llm22:~ # cat /proc/sys/fs/file-nr
587730  0       758844

At the same time, I see only a 2000+ objects in filp cache.  The following
patch I fixes this problem.

This patch changes the file counting by removing the filp_count_lock.
Instead we use a separate percpu counter, nr_files, for now and all
accesses to it are through get_nr_files() api.  In the sysctl handler for
nr_files, we populate files_stat.nr_files before returning to user.

Counting files as an when they are created and destroyed (as opposed to
inside slab) allows us to correctly count open files with RCU.

Signed-off-by: Dipankar Sarma &lt;dipankar@in.ibm.com&gt;
Cc: "Paul E. McKenney" &lt;paulmck@us.ibm.com&gt;
Cc: "David S. Miller" &lt;davem@davemloft.net&gt;
Signed-off-by: Andrew Morton &lt;akpm@osdl.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@osdl.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
I have benchmarked this on an x86_64 NUMA system and see no significant
performance difference on kernbench.  Tested on both x86_64 and powerpc.

The way we do file struct accounting is not very suitable for batched
freeing.  For scalability reasons, file accounting was
constructor/destructor based.  This meant that nr_files was decremented
only when the object was removed from the slab cache.  This is susceptible
to slab fragmentation.  With RCU based file structure, consequent batched
freeing and a test program like Serge's, we just speed this up and end up
with a very fragmented slab -

llm22:~ # cat /proc/sys/fs/file-nr
587730  0       758844

At the same time, I see only a 2000+ objects in filp cache.  The following
patch I fixes this problem.

This patch changes the file counting by removing the filp_count_lock.
Instead we use a separate percpu counter, nr_files, for now and all
accesses to it are through get_nr_files() api.  In the sysctl handler for
nr_files, we populate files_stat.nr_files before returning to user.

Counting files as an when they are created and destroyed (as opposed to
inside slab) allows us to correctly count open files with RCU.

Signed-off-by: Dipankar Sarma &lt;dipankar@in.ibm.com&gt;
Cc: "Paul E. McKenney" &lt;paulmck@us.ibm.com&gt;
Cc: "David S. Miller" &lt;davem@davemloft.net&gt;
Signed-off-by: Andrew Morton &lt;akpm@osdl.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@osdl.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>[PATCH] make "struct d_cookie" depend on CONFIG_PROFILING</title>
<updated>2006-02-03T16:32:04+00:00</updated>
<author>
<name>Marcelo Tosatti</name>
<email>marcelo.tosatti@cyclades.com</email>
</author>
<published>2006-02-03T11:04:06+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=47ba87e0b1269698801310bfd1716b0538282405'/>
<id>47ba87e0b1269698801310bfd1716b0538282405</id>
<content type='text'>
Shrinks "struct dentry" from 128 bytes to 124 on x86, allowing 31 objects
per slab instead of 30.

Cc: John Levon &lt;levon@movementarian.org&gt;
Cc: Philippe Elie &lt;phil.el@wanadoo.fr&gt;
Signed-off-by: Andrew Morton &lt;akpm@osdl.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@osdl.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Shrinks "struct dentry" from 128 bytes to 124 on x86, allowing 31 objects
per slab instead of 30.

Cc: John Levon &lt;levon@movementarian.org&gt;
Cc: Philippe Elie &lt;phil.el@wanadoo.fr&gt;
Signed-off-by: Andrew Morton &lt;akpm@osdl.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@osdl.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>[PATCH] Unlinline a bunch of other functions</title>
<updated>2006-01-15T02:27:06+00:00</updated>
<author>
<name>Arjan van de Ven</name>
<email>arjan@infradead.org</email>
</author>
<published>2006-01-14T21:20:43+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=858119e159384308a5dde67776691a2ebf70df0f'/>
<id>858119e159384308a5dde67776691a2ebf70df0f</id>
<content type='text'>
Remove the "inline" keyword from a bunch of big functions in the kernel with
the goal of shrinking it by 30kb to 40kb

Signed-off-by: Arjan van de Ven &lt;arjan@infradead.org&gt;
Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
Acked-by: Jeff Garzik &lt;jgarzik@pobox.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@osdl.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@osdl.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Remove the "inline" keyword from a bunch of big functions in the kernel with
the goal of shrinking it by 30kb to 40kb

Signed-off-by: Arjan van de Ven &lt;arjan@infradead.org&gt;
Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
Acked-by: Jeff Garzik &lt;jgarzik@pobox.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@osdl.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@osdl.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>[PATCH] d_instantiate_unique / NFS inode leakage</title>
<updated>2006-01-10T16:01:41+00:00</updated>
<author>
<name>Oleg Drokin</name>
<email>green@linuxhacker.ru</email>
</author>
<published>2006-01-10T04:52:51+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=e866cfa939de7f52c154a9495eb5767f89abf453'/>
<id>e866cfa939de7f52c154a9495eb5767f89abf453</id>
<content type='text'>
If we have found aliased dentry that we return, inode reference is not
dropped and inode is not attached anywhere, so it seems the reference to
inode is leaked in that case.

Cc: Trond Myklebust &lt;trond.myklebust@fys.uio.no&gt;,
Cc: &lt;viro@parcelfarce.linux.theplanet.co.uk&gt;
Cc: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Andrew Morton &lt;akpm@osdl.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@osdl.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
If we have found aliased dentry that we return, inode reference is not
dropped and inode is not attached anywhere, so it seems the reference to
inode is leaked in that case.

Cc: Trond Myklebust &lt;trond.myklebust@fys.uio.no&gt;,
Cc: &lt;viro@parcelfarce.linux.theplanet.co.uk&gt;
Cc: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Andrew Morton &lt;akpm@osdl.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@osdl.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>[PATCH] shrink dentry struct</title>
<updated>2006-01-09T04:13:58+00:00</updated>
<author>
<name>Eric Dumazet</name>
<email>dada1@cosmosbay.com</email>
</author>
<published>2006-01-08T09:03:32+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=5160ee6fc891a9ca114be0e90fa6655647bb64b2'/>
<id>5160ee6fc891a9ca114be0e90fa6655647bb64b2</id>
<content type='text'>
Some long time ago, dentry struct was carefully tuned so that on 32 bits
UP, sizeof(struct dentry) was exactly 128, ie a power of 2, and a multiple
of memory cache lines.

Then RCU was added and dentry struct enlarged by two pointers, with nice
results for SMP, but not so good on UP, because breaking the above tuning
(128 + 8 = 136 bytes)

This patch reverts this unwanted side effect, by using an union (d_u),
where d_rcu and d_child are placed so that these two fields can share their
memory needs.

At the time d_free() is called (and d_rcu is really used), d_child is known
to be empty and not touched by the dentry freeing.

Lockless lookups only access d_name, d_parent, d_lock, d_op, d_flags (so
the previous content of d_child is not needed if said dentry was unhashed
but still accessed by a CPU because of RCU constraints)

As dentry cache easily contains millions of entries, a size reduction is
worth the extra complexity of the ugly C union.

Signed-off-by: Eric Dumazet &lt;dada1@cosmosbay.com&gt;
Cc: Dipankar Sarma &lt;dipankar@in.ibm.com&gt;
Cc: Maneesh Soni &lt;maneesh@in.ibm.com&gt;
Cc: Miklos Szeredi &lt;miklos@szeredi.hu&gt;
Cc: "Paul E. McKenney" &lt;paulmck@us.ibm.com&gt;
Cc: Ian Kent &lt;raven@themaw.net&gt;
Cc: Paul Jackson &lt;pj@sgi.com&gt;
Cc: Al Viro &lt;viro@ftp.linux.org.uk&gt;
Cc: Christoph Hellwig &lt;hch@lst.de&gt;
Cc: Trond Myklebust &lt;trond.myklebust@fys.uio.no&gt;
Cc: Neil Brown &lt;neilb@cse.unsw.edu.au&gt;
Cc: James Morris &lt;jmorris@namei.org&gt;
Cc: Stephen Smalley &lt;sds@epoch.ncsc.mil&gt;
Signed-off-by: Andrew Morton &lt;akpm@osdl.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@osdl.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Some long time ago, dentry struct was carefully tuned so that on 32 bits
UP, sizeof(struct dentry) was exactly 128, ie a power of 2, and a multiple
of memory cache lines.

Then RCU was added and dentry struct enlarged by two pointers, with nice
results for SMP, but not so good on UP, because breaking the above tuning
(128 + 8 = 136 bytes)

This patch reverts this unwanted side effect, by using an union (d_u),
where d_rcu and d_child are placed so that these two fields can share their
memory needs.

At the time d_free() is called (and d_rcu is really used), d_child is known
to be empty and not touched by the dentry freeing.

Lockless lookups only access d_name, d_parent, d_lock, d_op, d_flags (so
the previous content of d_child is not needed if said dentry was unhashed
but still accessed by a CPU because of RCU constraints)

As dentry cache easily contains millions of entries, a size reduction is
worth the extra complexity of the ugly C union.

Signed-off-by: Eric Dumazet &lt;dada1@cosmosbay.com&gt;
Cc: Dipankar Sarma &lt;dipankar@in.ibm.com&gt;
Cc: Maneesh Soni &lt;maneesh@in.ibm.com&gt;
Cc: Miklos Szeredi &lt;miklos@szeredi.hu&gt;
Cc: "Paul E. McKenney" &lt;paulmck@us.ibm.com&gt;
Cc: Ian Kent &lt;raven@themaw.net&gt;
Cc: Paul Jackson &lt;pj@sgi.com&gt;
Cc: Al Viro &lt;viro@ftp.linux.org.uk&gt;
Cc: Christoph Hellwig &lt;hch@lst.de&gt;
Cc: Trond Myklebust &lt;trond.myklebust@fys.uio.no&gt;
Cc: Neil Brown &lt;neilb@cse.unsw.edu.au&gt;
Cc: James Morris &lt;jmorris@namei.org&gt;
Cc: Stephen Smalley &lt;sds@epoch.ncsc.mil&gt;
Signed-off-by: Andrew Morton &lt;akpm@osdl.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@osdl.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>[PATCH] Remove hlist_for_each_rcu() API, convert existing use to hlist_for_each_entry_rcu</title>
<updated>2005-11-07T15:53:35+00:00</updated>
<author>
<name>Paul E. McKenney</name>
<email>paulmck@us.ibm.com</email>
</author>
<published>2005-11-07T08:59:17+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=665a7583f32ab5b3bfe7a4d88da506542f7cdd75'/>
<id>665a7583f32ab5b3bfe7a4d88da506542f7cdd75</id>
<content type='text'>
Remove the hlist_for_each_rcu() API, which is used only in one place, and
is trivially converted to hlist_for_each_entry_rcu(), making the code
shorter and more readable.  Any out-of-tree uses may be similarly
converted.

Signed-off-by: "Paul E. McKenney" &lt;paulmck@us.ibm.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@osdl.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@osdl.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Remove the hlist_for_each_rcu() API, which is used only in one place, and
is trivially converted to hlist_for_each_entry_rcu(), making the code
shorter and more readable.  Any out-of-tree uses may be similarly
converted.

Signed-off-by: "Paul E. McKenney" &lt;paulmck@us.ibm.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@osdl.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@osdl.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>[PATCH] gfp_t: fs/*</title>
<updated>2005-10-28T15:16:47+00:00</updated>
<author>
<name>Al Viro</name>
<email>viro@zeniv.linux.org.uk</email>
</author>
<published>2005-10-21T07:20:48+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=27496a8c67bef4d789d8e3c8317ca35813a507ae'/>
<id>27496a8c67bef4d789d8e3c8317ca35813a507ae</id>
<content type='text'>
 - -&gt;releasepage() annotated (s/int/gfp_t), instances updated
 - missing gfp_t in fs/* added
 - fixed misannotation from the original sweep caught by bitwise checks:
   XFS used __nocast both for gfp_t and for flags used by XFS allocator.
   The latter left with unsigned int __nocast; we might want to add a
   different type for those but for now let's leave them alone.  That,
   BTW, is a case when __nocast use had been actively confusing - it had
   been used in the same code for two different and similar types, with
   no way to catch misuses.  Switch of gfp_t to bitwise had caught that
   immediately...

One tricky bit is left alone to be dealt with later - mapping-&gt;flags is
a mix of gfp_t and error indications.  Left alone for now.

Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@osdl.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
 - -&gt;releasepage() annotated (s/int/gfp_t), instances updated
 - missing gfp_t in fs/* added
 - fixed misannotation from the original sweep caught by bitwise checks:
   XFS used __nocast both for gfp_t and for flags used by XFS allocator.
   The latter left with unsigned int __nocast; we might want to add a
   different type for those but for now let's leave them alone.  That,
   BTW, is a case when __nocast use had been actively confusing - it had
   been used in the same code for two different and similar types, with
   no way to catch misuses.  Switch of gfp_t to bitwise had caught that
   immediately...

One tricky bit is left alone to be dealt with later - mapping-&gt;flags is
a mix of gfp_t and error indications.  Left alone for now.

Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@osdl.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Make fsnotify possibly work better for the inode removal case</title>
<updated>2005-09-20T02:54:29+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@g5.osdl.org</email>
</author>
<published>2005-09-20T02:54:29+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=f805fbdaacf4367ce566743a665622387768ac0d'/>
<id>f805fbdaacf4367ce566743a665622387768ac0d</id>
<content type='text'>
Checking i_nlink is dubious, but the alternatives look even
less appetizing.

Signed-off-by: Linus Torvalds &lt;torvalds@osdl.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Checking i_nlink is dubious, but the alternatives look even
less appetizing.

Signed-off-by: Linus Torvalds &lt;torvalds@osdl.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>[PATCH] janitor: fs/dcache.c: list_for_each*</title>
<updated>2005-09-10T17:06:32+00:00</updated>
<author>
<name>Domen Puncer</name>
<email>domen@coderock.org</email>
</author>
<published>2005-09-10T07:27:07+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=0cdca3f9806a3dbaa07b5e8175000cd513ba92d4'/>
<id>0cdca3f9806a3dbaa07b5e8175000cd513ba92d4</id>
<content type='text'>
First one is list_for_each_entry (thanks maks), second 2 list_for_each_safe.

Signed-off-by: Maximilian Attems &lt;janitor@sternwelten.at&gt;
Signed-off-by: Domen Puncer &lt;domen@coderock.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@osdl.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@osdl.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
First one is list_for_each_entry (thanks maks), second 2 list_for_each_safe.

Signed-off-by: Maximilian Attems &lt;janitor@sternwelten.at&gt;
Signed-off-by: Domen Puncer &lt;domen@coderock.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@osdl.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@osdl.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>[PATCH] fsnotify_name/inoderemove</title>
<updated>2005-08-08T18:53:47+00:00</updated>
<author>
<name>John McCutchan</name>
<email>ttb@tentacle.dhs.org</email>
</author>
<published>2005-08-08T17:52:16+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=7a91bf7f5c22c8407a9991cbd9ce5bb87caa6b4a'/>
<id>7a91bf7f5c22c8407a9991cbd9ce5bb87caa6b4a</id>
<content type='text'>
The patch below unhooks fsnotify from vfs_unlink &amp; vfs_rmdir.  It
introduces two new fsnotify calls, that are hooked in at the dcache
level.  This not only more closely matches how the VFS layer works, it
also avoids the problem with locking and inode lifetimes.

The two functions are

 - fsnotify_nameremove -- called when a directory entry is going away.
   It notifies the PARENT of the deletion.  This is called from
   d_delete().

 - inoderemove -- called when the files inode itself is going away.  It
   notifies the inode that is being deleted.  This is called from
   dentry_iput().

Signed-off-by: John McCutchan &lt;ttb@tentacle.dhs.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@osdl.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The patch below unhooks fsnotify from vfs_unlink &amp; vfs_rmdir.  It
introduces two new fsnotify calls, that are hooked in at the dcache
level.  This not only more closely matches how the VFS layer works, it
also avoids the problem with locking and inode lifetimes.

The two functions are

 - fsnotify_nameremove -- called when a directory entry is going away.
   It notifies the PARENT of the deletion.  This is called from
   d_delete().

 - inoderemove -- called when the files inode itself is going away.  It
   notifies the inode that is being deleted.  This is called from
   dentry_iput().

Signed-off-by: John McCutchan &lt;ttb@tentacle.dhs.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@osdl.org&gt;
</pre>
</div>
</content>
</entry>
</feed>
