<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/fs/inode.c, branch v2.6.29</title>
<subtitle>Linux kernel source tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/'/>
<entry>
<title>fs: new inode i_state corruption fix</title>
<updated>2009-03-12T23:20:24+00:00</updated>
<author>
<name>Nick Piggin</name>
<email>npiggin@suse.de</email>
</author>
<published>2009-03-12T21:31:38+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=7ef0d7377cb287e08f3ae94cebc919448e1f5dff'/>
<id>7ef0d7377cb287e08f3ae94cebc919448e1f5dff</id>
<content type='text'>
There was a report of a data corruption
http://lkml.org/lkml/2008/11/14/121.  There is a script included to
reproduce the problem.

During testing, I encountered a number of strange things with ext3, so I
tried ext2 to attempt to reduce complexity of the problem.  I found that
fsstress would quickly hang in wait_on_inode, waiting for I_LOCK to be
cleared, even though instrumentation showed that unlock_new_inode had
already been called for that inode.  This points to memory scribble, or
synchronisation problme.

i_state of I_NEW inodes is not protected by inode_lock because other
processes are not supposed to touch them until I_LOCK (and I_NEW) is
cleared.  Adding WARN_ON(inode-&gt;i_state &amp; I_NEW) to sites where we modify
i_state revealed that generic_sync_sb_inodes is picking up new inodes from
the inode lists and passing them to __writeback_single_inode without
waiting for I_NEW.  Subsequently modifying i_state causes corruption.  In
my case it would look like this:

CPU0                            CPU1
unlock_new_inode()              __sync_single_inode()
 reg &lt;- inode-&gt;i_state
 reg -&gt; reg &amp; ~(I_LOCK|I_NEW)   reg &lt;- inode-&gt;i_state
 reg -&gt; inode-&gt;i_state          reg -&gt; reg | I_SYNC
                                reg -&gt; inode-&gt;i_state

Non-atomic RMW on CPU1 overwrites CPU0 store and sets I_LOCK|I_NEW again.

Fix for this is rather than wait for I_NEW inodes, just skip over them:
inodes concurrently being created are not subject to data integrity
operations, and should not significantly contribute to dirty memory
either.

After this change, I'm unable to reproduce any of the added warnings or
hangs after ~1hour of running.  Previously, the new warnings would start
immediately and hang would happen in under 5 minutes.

I'm also testing on ext3 now, and so far no problems there either.  I
don't know whether this fixes the problem reported above, but it fixes a
real problem for me.

Cc: "Jorge Boncompte [DTI2]" &lt;jorge@dti2.net&gt;
Reported-by: Adrian Hunter &lt;ext-adrian.hunter@nokia.com&gt;
Cc: Jan Kara &lt;jack@suse.cz&gt;
Cc: &lt;stable@kernel.org&gt;
Signed-off-by: Nick Piggin &lt;npiggin@suse.de&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
There was a report of a data corruption
http://lkml.org/lkml/2008/11/14/121.  There is a script included to
reproduce the problem.

During testing, I encountered a number of strange things with ext3, so I
tried ext2 to attempt to reduce complexity of the problem.  I found that
fsstress would quickly hang in wait_on_inode, waiting for I_LOCK to be
cleared, even though instrumentation showed that unlock_new_inode had
already been called for that inode.  This points to memory scribble, or
synchronisation problme.

i_state of I_NEW inodes is not protected by inode_lock because other
processes are not supposed to touch them until I_LOCK (and I_NEW) is
cleared.  Adding WARN_ON(inode-&gt;i_state &amp; I_NEW) to sites where we modify
i_state revealed that generic_sync_sb_inodes is picking up new inodes from
the inode lists and passing them to __writeback_single_inode without
waiting for I_NEW.  Subsequently modifying i_state causes corruption.  In
my case it would look like this:

CPU0                            CPU1
unlock_new_inode()              __sync_single_inode()
 reg &lt;- inode-&gt;i_state
 reg -&gt; reg &amp; ~(I_LOCK|I_NEW)   reg &lt;- inode-&gt;i_state
 reg -&gt; inode-&gt;i_state          reg -&gt; reg | I_SYNC
                                reg -&gt; inode-&gt;i_state

Non-atomic RMW on CPU1 overwrites CPU0 store and sets I_LOCK|I_NEW again.

Fix for this is rather than wait for I_NEW inodes, just skip over them:
inodes concurrently being created are not subject to data integrity
operations, and should not significantly contribute to dirty memory
either.

After this change, I'm unable to reproduce any of the added warnings or
hangs after ~1hour of running.  Previously, the new warnings would start
immediately and hang would happen in under 5 minutes.

I'm also testing on ext3 now, and so far no problems there either.  I
don't know whether this fixes the problem reported above, but it fixes a
real problem for me.

Cc: "Jorge Boncompte [DTI2]" &lt;jorge@dti2.net&gt;
Reported-by: Adrian Hunter &lt;ext-adrian.hunter@nokia.com&gt;
Cc: Jan Kara &lt;jack@suse.cz&gt;
Cc: &lt;stable@kernel.org&gt;
Signed-off-by: Nick Piggin &lt;npiggin@suse.de&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>partial revert of asynchronous inode delete</title>
<updated>2009-01-09T21:15:49+00:00</updated>
<author>
<name>Arjan van de Ven</name>
<email>arjan@linux.intel.com</email>
</author>
<published>2009-01-09T15:04:15+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=b32714ba29358a688ef337d5297bf4bdc9f596dc'/>
<id>b32714ba29358a688ef337d5297bf4bdc9f596dc</id>
<content type='text'>
let the core of this one bake in -next as well, but leave
some of the infrastructure in place.

Signed-off-by: Arjan van de Ven &lt;arjan@linux.intel.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
let the core of this one bake in -next as well, but leave
some of the infrastructure in place.

Signed-off-by: Arjan van de Ven &lt;arjan@linux.intel.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>async: make the final inode deletion an asynchronous event</title>
<updated>2009-01-07T16:47:24+00:00</updated>
<author>
<name>Arjan van de Ven</name>
<email>arjan@linux.intel.com</email>
</author>
<published>2009-01-06T15:20:54+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=efaee192063a54749c56b7383803e16fe553630e'/>
<id>efaee192063a54749c56b7383803e16fe553630e</id>
<content type='text'>
this makes "rm -rf" on a (names cached) kernel tree go from
11.6 to 8.6 seconds on an ext3 filesystem

Signed-off-by: Arjan van de Ven &lt;arjan@linux.intel.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
this makes "rm -rf" on a (names cached) kernel tree go from
11.6 to 8.6 seconds on an ext3 filesystem

Signed-off-by: Arjan van de Ven &lt;arjan@linux.intel.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>fs/inode: fix kernel-doc notation</title>
<updated>2009-01-06T23:59:14+00:00</updated>
<author>
<name>Randy Dunlap</name>
<email>randy.dunlap@oracle.com</email>
</author>
<published>2009-01-06T22:41:13+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=0bc02f3fa433a98631a932e77c4b1f873da35aee'/>
<id>0bc02f3fa433a98631a932e77c4b1f873da35aee</id>
<content type='text'>
Fix kernel-doc notation:

Warning(linux-2.6.28-git3//fs/inode.c:120): No description found for parameter 'sb'
Warning(linux-2.6.28-git3//fs/inode.c:120): No description found for parameter 'inode'
Warning(linux-2.6.28-git3//fs/inode.c:588): No description found for parameter 'sb'
Warning(linux-2.6.28-git3//fs/inode.c:588): No description found for parameter 'inode'

Signed-off-by: Randy Dunlap &lt;randy.dunlap@oracle.com&gt;
Cc: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Fix kernel-doc notation:

Warning(linux-2.6.28-git3//fs/inode.c:120): No description found for parameter 'sb'
Warning(linux-2.6.28-git3//fs/inode.c:120): No description found for parameter 'inode'
Warning(linux-2.6.28-git3//fs/inode.c:588): No description found for parameter 'sb'
Warning(linux-2.6.28-git3//fs/inode.c:588): No description found for parameter 'inode'

Signed-off-by: Randy Dunlap &lt;randy.dunlap@oracle.com&gt;
Cc: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>mm: remove GFP_HIGHUSER_PAGECACHE</title>
<updated>2009-01-06T23:59:01+00:00</updated>
<author>
<name>Hugh Dickins</name>
<email>hugh@veritas.com</email>
</author>
<published>2009-01-06T22:39:23+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=3c1d43787b48c798f44dc32a6e6deb5ca2da3e68'/>
<id>3c1d43787b48c798f44dc32a6e6deb5ca2da3e68</id>
<content type='text'>
GFP_HIGHUSER_PAGECACHE is just an alias for GFP_HIGHUSER_MOVABLE, making
that harder to track down: remove it, and its out-of-work brothers
GFP_NOFS_PAGECACHE and GFP_USER_PAGECACHE.

Since we're making that improvement to hotremove_migrate_alloc(), I think
we can now also remove one of the "o"s from its comment.

Signed-off-by: Hugh Dickins &lt;hugh@veritas.com&gt;
Acked-by: Mel Gorman &lt;mel@csn.ul.ie&gt;
Cc: Nick Piggin &lt;nickpiggin@yahoo.com.au&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
GFP_HIGHUSER_PAGECACHE is just an alias for GFP_HIGHUSER_MOVABLE, making
that harder to track down: remove it, and its out-of-work brothers
GFP_NOFS_PAGECACHE and GFP_USER_PAGECACHE.

Since we're making that improvement to hotremove_migrate_alloc(), I think
we can now also remove one of the "o"s from its comment.

Signed-off-by: Hugh Dickins &lt;hugh@veritas.com&gt;
Acked-by: Mel Gorman &lt;mel@csn.ul.ie&gt;
Cc: Nick Piggin &lt;nickpiggin@yahoo.com.au&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>zero i_uid/i_gid on inode allocation</title>
<updated>2009-01-05T16:54:28+00:00</updated>
<author>
<name>Al Viro</name>
<email>viro@zeniv.linux.org.uk</email>
</author>
<published>2008-12-09T14:34:39+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=56ff5efad96182f4d3cb3dc6b07396762c658f16'/>
<id>56ff5efad96182f4d3cb3dc6b07396762c658f16</id>
<content type='text'>
... and don't bother in callers.  Don't bother with zeroing i_blocks,
while we are at it - it's already been zeroed.

i_mode is not worth the effort; it has no common default value.

Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
... and don't bother in callers.  Don't bother with zeroing i_blocks,
while we are at it - it's already been zeroed.

i_mode is not worth the effort; it has no common default value.

Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>nfsd/create race fixes, infrastructure</title>
<updated>2008-12-31T23:07:43+00:00</updated>
<author>
<name>Al Viro</name>
<email>viro@zeniv.linux.org.uk</email>
</author>
<published>2008-12-30T06:48:21+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=261bca86ed4f7f391d1938167624e78da61dcc6b'/>
<id>261bca86ed4f7f391d1938167624e78da61dcc6b</id>
<content type='text'>
new helpers - insert_inode_locked() and insert_inode_locked4().
Hash new inode, making sure that there's no such inode in icache
already.  If there is and it does not end up unhashed (as would
happen if we have nfsd trying to resolve a bogus fhandle), fail.
Otherwise insert our inode into hash and succeed.

In either case have i_state set to new+locked; cleanup ends up
being simpler with such calling conventions.

Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
new helpers - insert_inode_locked() and insert_inode_locked4().
Hash new inode, making sure that there's no such inode in icache
already.  If there is and it does not end up unhashed (as would
happen if we have nfsd trying to resolve a bogus fhandle), fail.
Otherwise insert our inode into hash and succeed.

In either case have i_state set to new+locked; cleanup ends up
being simpler with such calling conventions.

Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>fs: xfs needs inode_wait to be exported</title>
<updated>2008-11-10T06:06:05+00:00</updated>
<author>
<name>Stephen Rothwell</name>
<email>sfr@canb.auug.org.au</email>
</author>
<published>2008-11-10T06:06:05+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=d44dab8d1cde8aeba1faf44a7654f90800feb7fc'/>
<id>d44dab8d1cde8aeba1faf44a7654f90800feb7fc</id>
<content type='text'>
Since wait_on_inode() references it.

Signed-off-by: Stephen Rothwell &lt;sfr@canb.auug.org.au&gt;
Reviewed-by: Dave Chinner &lt;david@fromorbit.com&gt;
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Lachlan McIlroy &lt;lachlan@sgi.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Since wait_on_inode() references it.

Signed-off-by: Stephen Rothwell &lt;sfr@canb.auug.org.au&gt;
Reviewed-by: Dave Chinner &lt;david@fromorbit.com&gt;
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Lachlan McIlroy &lt;lachlan@sgi.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Inode: export symbol destroy_inode</title>
<updated>2008-10-30T07:24:37+00:00</updated>
<author>
<name>Christoph Hellwig</name>
<email>hch@infradead.org</email>
</author>
<published>2008-10-30T07:24:37+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=087e3b0460c367d0f4a5b71d7b013968ae23b588'/>
<id>087e3b0460c367d0f4a5b71d7b013968ae23b588</id>
<content type='text'>
To make sure we free the security data inodes need to be freed using
the proper VFS helper (which we also need to export for this). We mark
these inodes bad so we can skip the flush path for them.

Signed-off-by: Christoph Hellwig &lt;hch@infradead.org&gt;
Signed-off-by: Lachlan McIlroy &lt;lachlan@sgi.com&gt;
Signed-off-by: David Chinner &lt;david@fromorbit.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
To make sure we free the security data inodes need to be freed using
the proper VFS helper (which we also need to export for this). We mark
these inodes bad so we can skip the flush path for them.

Signed-off-by: Christoph Hellwig &lt;hch@infradead.org&gt;
Signed-off-by: Lachlan McIlroy &lt;lachlan@sgi.com&gt;
Signed-off-by: David Chinner &lt;david@fromorbit.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Inode: Allow external list initialisation</title>
<updated>2008-10-30T06:35:24+00:00</updated>
<author>
<name>David Chinner</name>
<email>david@fromorbit.com</email>
</author>
<published>2008-10-30T06:35:24+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=8290c35f87304a6b73d4fd17b03580b4f7425de8'/>
<id>8290c35f87304a6b73d4fd17b03580b4f7425de8</id>
<content type='text'>
To allow XFS to combine the XFS and linux inodes into a single
structure, we need to drive inode lookup from the XFS inode cache,
not the generic inode cache. This means that we need initialise a
struct inode from a context outside alloc_inode() as it is no longer
used by XFS.

After inode allocation and initialisation, we need to add the inode
to the superblock list, the in-use list, hash it and do some
accounting. This all needs to be done with the inode_lock held and
there are already several places in fs/inode.c that do this list
manipulation.  Factor out the common code, add a locking wrapper and
export the function so ti can be called from XFS.

Signed-off-by: Dave Chinner &lt;david@fromorbit.com&gt;
Signed-off-by: Christoph Hellwig &lt;hch@infradead.org&gt;
Signed-off-by: Lachlan McIlroy &lt;lachlan@sgi.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
To allow XFS to combine the XFS and linux inodes into a single
structure, we need to drive inode lookup from the XFS inode cache,
not the generic inode cache. This means that we need initialise a
struct inode from a context outside alloc_inode() as it is no longer
used by XFS.

After inode allocation and initialisation, we need to add the inode
to the superblock list, the in-use list, hash it and do some
accounting. This all needs to be done with the inode_lock held and
there are already several places in fs/inode.c that do this list
manipulation.  Factor out the common code, add a locking wrapper and
export the function so ti can be called from XFS.

Signed-off-by: Dave Chinner &lt;david@fromorbit.com&gt;
Signed-off-by: Christoph Hellwig &lt;hch@infradead.org&gt;
Signed-off-by: Lachlan McIlroy &lt;lachlan@sgi.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
