<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/fs/ext3, branch v2.6.23</title>
<subtitle>Linux kernel source tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/'/>
<entry>
<title>ext34: ensure do_split leaves enough free space in both blocks</title>
<updated>2007-09-19T18:24:18+00:00</updated>
<author>
<name>Eric Sandeen</name>
<email>sandeen@redhat.com</email>
</author>
<published>2007-09-19T05:46:42+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=ef2b02d3e617cb0400eedf2668f86215e1b0e6af'/>
<id>ef2b02d3e617cb0400eedf2668f86215e1b0e6af</id>
<content type='text'>
The do_split() function for htree dir blocks is intended to split a leaf
block to make room for a new entry.  It sorts the entries in the original
block by hash value, then moves the last half of the entries to the new
block - without accounting for how much space this actually moves.  (IOW,
it moves half of the entry *count* not half of the entry *space*).  If by
chance we have both large &amp; small entries, and we move only the smallest
entries, and we have a large new entry to insert, we may not have created
enough space for it.

The patch below stores each record size when calculating the dx_map, and
then walks the hash-sorted dx_map, calculating how many entries must be
moved to more evenly split the existing entries between the old block and
the new block, guaranteeing enough space for the new entry.

The dx_map "offs" member is reduced to u16 so that the overall map size
does not change - it is temporarily stored at the end of the new block, and
if it grows too large it may be overwritten.  By making offs and size both
u16, we won't grow the map size.

Also add a few comments to the functions involved.

This fixes the testcase reported by hooanon05@yahoo.co.jp on the
linux-ext4 list, "ext3 dir_index causes an error"

Thanks to Andreas Dilger for discussing the problem &amp; solution with me.

Signed-off-by: Eric Sandeen &lt;sandeen@redhat.com&gt;
Signed-off-by: Andreas Dilger &lt;adilger@clusterfs.com&gt;
Tested-by: Junjiro Okajima &lt;hooanon05@yahoo.co.jp&gt;
Cc: Theodore Ts'o &lt;tytso@mit.edu&gt;
Cc: &lt;linux-ext4@vger.kernel.org&gt;
Cc: &lt;stable@kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The do_split() function for htree dir blocks is intended to split a leaf
block to make room for a new entry.  It sorts the entries in the original
block by hash value, then moves the last half of the entries to the new
block - without accounting for how much space this actually moves.  (IOW,
it moves half of the entry *count* not half of the entry *space*).  If by
chance we have both large &amp; small entries, and we move only the smallest
entries, and we have a large new entry to insert, we may not have created
enough space for it.

The patch below stores each record size when calculating the dx_map, and
then walks the hash-sorted dx_map, calculating how many entries must be
moved to more evenly split the existing entries between the old block and
the new block, guaranteeing enough space for the new entry.

The dx_map "offs" member is reduced to u16 so that the overall map size
does not change - it is temporarily stored at the end of the new block, and
if it grows too large it may be overwritten.  By making offs and size both
u16, we won't grow the map size.

Also add a few comments to the functions involved.

This fixes the testcase reported by hooanon05@yahoo.co.jp on the
linux-ext4 list, "ext3 dir_index causes an error"

Thanks to Andreas Dilger for discussing the problem &amp; solution with me.

Signed-off-by: Eric Sandeen &lt;sandeen@redhat.com&gt;
Signed-off-by: Andreas Dilger &lt;adilger@clusterfs.com&gt;
Tested-by: Junjiro Okajima &lt;hooanon05@yahoo.co.jp&gt;
Cc: Theodore Ts'o &lt;tytso@mit.edu&gt;
Cc: &lt;linux-ext4@vger.kernel.org&gt;
Cc: &lt;stable@kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>dir_index: error out instead of BUG on corrupt dx dirs</title>
<updated>2007-09-19T18:24:18+00:00</updated>
<author>
<name>Eric Sandeen</name>
<email>sandeen@redhat.com</email>
</author>
<published>2007-09-19T05:46:38+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=3d82abae9523c33d4a16fdfdfd2bdde316d7b56a'/>
<id>3d82abae9523c33d4a16fdfdfd2bdde316d7b56a</id>
<content type='text'>
Convert asserts (BUGs) in dx_probe from bad on-disk data to recoverable
errors with helpful warnings.  With help catching other asserts from Duane
Griffin &lt;duaneg@dghda.com&gt;

Signed-off-by: Eric Sandeen &lt;sandeen@redhat.com&gt;
Acked-by: Duane Griffin &lt;duaneg@dghda.com&gt;
Acked-by: Theodore Ts'o &lt;tytso@mit.edu&gt;
Cc: &lt;stable@kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Convert asserts (BUGs) in dx_probe from bad on-disk data to recoverable
errors with helpful warnings.  With help catching other asserts from Duane
Griffin &lt;duaneg@dghda.com&gt;

Signed-off-by: Eric Sandeen &lt;sandeen@redhat.com&gt;
Acked-by: Duane Griffin &lt;duaneg@dghda.com&gt;
Acked-by: Theodore Ts'o &lt;tytso@mit.edu&gt;
Cc: &lt;stable@kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>quota: fix infinite loop</title>
<updated>2007-09-12T00:21:19+00:00</updated>
<author>
<name>Jan Kara</name>
<email>jack@suse.cz</email>
</author>
<published>2007-09-11T22:23:29+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=9c3013e9b91ad23ecae88e45405e98208cce455d'/>
<id>9c3013e9b91ad23ecae88e45405e98208cce455d</id>
<content type='text'>
If we fail to start a transaction when releasing dquot, we have to call
dquot_release() anyway to mark dquot structure as inactive.  Otherwise we
end in an infinite loop inside dqput().

Signed-off-by: Jan Kara &lt;jack@suse.cz&gt;
Cc: xb &lt;xavier.bru@bull.net&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
If we fail to start a transaction when releasing dquot, we have to call
dquot_release() anyway to mark dquot structure as inactive.  Otherwise we
end in an infinite loop inside dqput().

Signed-off-by: Jan Kara &lt;jack@suse.cz&gt;
Cc: xb &lt;xavier.bru@bull.net&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>fix inode_table test in ext234_check_descriptors</title>
<updated>2007-07-26T18:35:17+00:00</updated>
<author>
<name>Eric Sandeen</name>
<email>sandeen@redhat.com</email>
</author>
<published>2007-07-26T17:41:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=780dcdb21170ae8ad3faa640ede249261f216a8c'/>
<id>780dcdb21170ae8ad3faa640ede249261f216a8c</id>
<content type='text'>
ext[234]_check_descriptors sanity checks block group descriptor geometry at
mount time, testing whether the block bitmap, inode bitmap, and inode table
reside wholly within the blockgroup.  However, the inode table test is off
by one so that if the last block in the inode table resides on the last
block of the block group, the test incorrectly fails.  This is because it
tests the last block as (start + length) rather than (start + length - 1).

This can be seen by trying to mount a filesystem made such as:

 mkfs.ext2 -F -b 1024 -m 0 -g 256 -N 3744 fsfile 1024

which yields:

 EXT2-fs error (device loop0): ext2_check_descriptors: Inode table for group 0 not in group (block 101)!
 EXT2-fs: group descriptors corrupted!

There is a similar bug in e2fsprogs, patch already sent for that.

(I wonder if inside(), outside(), and/or in_range() should someday be
used in this and other tests throughout the ext filesystems...)

Signed-off-by: Eric Sandeen &lt;sandeen@redhat.com&gt;
Cc: &lt;linux-ext4@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
ext[234]_check_descriptors sanity checks block group descriptor geometry at
mount time, testing whether the block bitmap, inode bitmap, and inode table
reside wholly within the blockgroup.  However, the inode table test is off
by one so that if the last block in the inode table resides on the last
block of the block group, the test incorrectly fails.  This is because it
tests the last block as (start + length) rather than (start + length - 1).

This can be seen by trying to mount a filesystem made such as:

 mkfs.ext2 -F -b 1024 -m 0 -g 256 -N 3744 fsfile 1024

which yields:

 EXT2-fs error (device loop0): ext2_check_descriptors: Inode table for group 0 not in group (block 101)!
 EXT2-fs: group descriptors corrupted!

There is a similar bug in e2fsprogs, patch already sent for that.

(I wonder if inside(), outside(), and/or in_range() should someday be
used in this and other tests throughout the ext filesystems...)

Signed-off-by: Eric Sandeen &lt;sandeen@redhat.com&gt;
Cc: &lt;linux-ext4@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>mm: Remove slab destructors from kmem_cache_create().</title>
<updated>2007-07-20T01:11:58+00:00</updated>
<author>
<name>Paul Mundt</name>
<email>lethal@linux-sh.org</email>
</author>
<published>2007-07-20T01:11:58+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=20c2df83d25c6a95affe6157a4c9cac4cf5ffaac'/>
<id>20c2df83d25c6a95affe6157a4c9cac4cf5ffaac</id>
<content type='text'>
Slab destructors were no longer supported after Christoph's
c59def9f222d44bb7e2f0a559f2906191a0862d7 change. They've been
BUGs for both slab and slub, and slob never supported them
either.

This rips out support for the dtor pointer from kmem_cache_create()
completely and fixes up every single callsite in the kernel (there were
about 224, not including the slab allocator definitions themselves,
or the documentation references).

Signed-off-by: Paul Mundt &lt;lethal@linux-sh.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Slab destructors were no longer supported after Christoph's
c59def9f222d44bb7e2f0a559f2906191a0862d7 change. They've been
BUGs for both slab and slub, and slob never supported them
either.

This rips out support for the dtor pointer from kmem_cache_create()
completely and fixes up every single callsite in the kernel (there were
about 224, not including the slab allocator definitions themselves,
or the documentation references).

Signed-off-by: Paul Mundt &lt;lethal@linux-sh.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>readahead: split ondemand readahead interface into two functions</title>
<updated>2007-07-19T17:04:44+00:00</updated>
<author>
<name>Rusty Russell</name>
<email>rusty@rustcorp.com.au</email>
</author>
<published>2007-07-19T08:48:08+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=cf914a7d656e62b9dd3e0dffe4f62b953ae6048d'/>
<id>cf914a7d656e62b9dd3e0dffe4f62b953ae6048d</id>
<content type='text'>
Split ondemand readahead interface into two functions.  I think this makes it
a little clearer for non-readahead experts (like Rusty).

Internally they both call ondemand_readahead(), but the page argument is
changed to an obvious boolean flag.

Signed-off-by: Rusty Russell &lt;rusty@rustcorp.com.au&gt;
Signed-off-by: Fengguang Wu &lt;wfg@mail.ustc.edu.cn&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Split ondemand readahead interface into two functions.  I think this makes it
a little clearer for non-readahead experts (like Rusty).

Internally they both call ondemand_readahead(), but the page argument is
changed to an obvious boolean flag.

Signed-off-by: Rusty Russell &lt;rusty@rustcorp.com.au&gt;
Signed-off-by: Fengguang Wu &lt;wfg@mail.ustc.edu.cn&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>readahead: convert ext3/ext4 invocations</title>
<updated>2007-07-19T17:04:44+00:00</updated>
<author>
<name>Fengguang Wu</name>
<email>wfg@mail.ustc.edu.cn</email>
</author>
<published>2007-07-19T08:48:04+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=dc7868fcb9a73990e6f30371c1be465c436a7a7f'/>
<id>dc7868fcb9a73990e6f30371c1be465c436a7a7f</id>
<content type='text'>
Convert ext3/ext4 dir reads to use on-demand readahead.

Readahead for dirs operates _not_ on file level, but on blockdev level.  This
makes a difference when the data blocks are not continuous.  And the read
routine is somehow opaque: there's no handy info about the status of current
page.  So a simplified call scheme is employed: to call into readahead
whenever the current page falls out of readahead windows.

Signed-off-by: Fengguang Wu &lt;wfg@mail.ustc.edu.cn&gt;
Cc: Steven Pratt &lt;slpratt@austin.ibm.com&gt;
Cc: Ram Pai &lt;linuxram@us.ibm.com&gt;
Cc: Rusty Russell &lt;rusty@rustcorp.com.au&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Convert ext3/ext4 dir reads to use on-demand readahead.

Readahead for dirs operates _not_ on file level, but on blockdev level.  This
makes a difference when the data blocks are not continuous.  And the read
routine is somehow opaque: there's no handy info about the status of current
page.  So a simplified call scheme is employed: to call into readahead
whenever the current page falls out of readahead windows.

Signed-off-by: Fengguang Wu &lt;wfg@mail.ustc.edu.cn&gt;
Cc: Steven Pratt &lt;slpratt@austin.ibm.com&gt;
Cc: Ram Pai &lt;linuxram@us.ibm.com&gt;
Cc: Rusty Russell &lt;rusty@rustcorp.com.au&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Introduce is_owner_or_cap() to wrap CAP_FOWNER use with fsuid check</title>
<updated>2007-07-17T19:00:03+00:00</updated>
<author>
<name>Satyam Sharma</name>
<email>ssatyam@cse.iitk.ac.in</email>
</author>
<published>2007-07-17T09:30:08+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=3bd858ab1c451725c07a805dcb315215dc85b86e'/>
<id>3bd858ab1c451725c07a805dcb315215dc85b86e</id>
<content type='text'>
Introduce is_owner_or_cap() macro in fs.h, and convert over relevant
users to it. This is done because we want to avoid bugs in the future
where we check for only effective fsuid of the current task against a
file's owning uid, without simultaneously checking for CAP_FOWNER as
well, thus violating its semantics.
[ XFS uses special macros and structures, and in general looked ...
untouchable, so we leave it alone -- but it has been looked over. ]

The (current-&gt;fsuid != inode-&gt;i_uid) check in generic_permission() and
exec_permission_lite() is left alone, because those operations are
covered by CAP_DAC_OVERRIDE and CAP_DAC_READ_SEARCH. Similarly operations
falling under the purview of CAP_CHOWN and CAP_LEASE are also left alone.

Signed-off-by: Satyam Sharma &lt;ssatyam@cse.iitk.ac.in&gt;
Cc: Al Viro &lt;viro@ftp.linux.org.uk&gt;
Acked-by: Serge E. Hallyn &lt;serge@hallyn.com&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Introduce is_owner_or_cap() macro in fs.h, and convert over relevant
users to it. This is done because we want to avoid bugs in the future
where we check for only effective fsuid of the current task against a
file's owning uid, without simultaneously checking for CAP_FOWNER as
well, thus violating its semantics.
[ XFS uses special macros and structures, and in general looked ...
untouchable, so we leave it alone -- but it has been looked over. ]

The (current-&gt;fsuid != inode-&gt;i_uid) check in generic_permission() and
exec_permission_lite() is left alone, because those operations are
covered by CAP_DAC_OVERRIDE and CAP_DAC_READ_SEARCH. Similarly operations
falling under the purview of CAP_CHOWN and CAP_LEASE are also left alone.

Signed-off-by: Satyam Sharma &lt;ssatyam@cse.iitk.ac.in&gt;
Cc: Al Viro &lt;viro@ftp.linux.org.uk&gt;
Acked-by: Serge E. Hallyn &lt;serge@hallyn.com&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>knfsd: exportfs: add exportfs.h header</title>
<updated>2007-07-17T17:23:06+00:00</updated>
<author>
<name>Christoph Hellwig</name>
<email>hch@infradead.org</email>
</author>
<published>2007-07-17T11:04:28+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=a569425512253992cc64ebf8b6d00a62f986db3e'/>
<id>a569425512253992cc64ebf8b6d00a62f986db3e</id>
<content type='text'>
currently the export_operation structure and helpers related to it are in
fs.h.  fs.h is already far too large and there are very few places needing the
export bits, so split them off into a separate header.

[akpm@linux-foundation.org: fix cifs build]
Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Neil Brown &lt;neilb@suse.de&gt;
Cc: Steven French &lt;sfrench@us.ibm.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
currently the export_operation structure and helpers related to it are in
fs.h.  fs.h is already far too large and there are very few places needing the
export bits, so split them off into a separate header.

[akpm@linux-foundation.org: fix cifs build]
Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Neil Brown &lt;neilb@suse.de&gt;
Cc: Steven French &lt;sfrench@us.ibm.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ext3: statfs speed up</title>
<updated>2007-07-16T16:05:52+00:00</updated>
<author>
<name>Badari Pulavarty</name>
<email>pbadari@us.ibm.com</email>
</author>
<published>2007-07-16T06:41:59+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=a71ce8c6c9bf269b192f352ea555217815cf027e'/>
<id>a71ce8c6c9bf269b192f352ea555217815cf027e</id>
<content type='text'>
This is a patch that speeds up statfs.  It is very simple - the "overhead"
calculation, which takes a huge amount of time for large filesystems, never
changes unless the size of the filesystem itself changes.  That means we can
store it in memory and only recalculate if the filesystem has been resized
(almost never).

It also fixes a minor problem that we never update the on-disk superblock free
blocks/inodes counts until the filesystem is unmounted.  While not fatal, we
may as well update that on disk when we have the information, and it makes
things like debugfs and dumpe2fs report a bit more accurate info.

Signed-off-by: Badari Pulavarty &lt;pbadari@us.ibm.com&gt;
Signed-off-by: Andreas Dilger &lt;adilger@clusterfs.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This is a patch that speeds up statfs.  It is very simple - the "overhead"
calculation, which takes a huge amount of time for large filesystems, never
changes unless the size of the filesystem itself changes.  That means we can
store it in memory and only recalculate if the filesystem has been resized
(almost never).

It also fixes a minor problem that we never update the on-disk superblock free
blocks/inodes counts until the filesystem is unmounted.  While not fatal, we
may as well update that on disk when we have the information, and it makes
things like debugfs and dumpe2fs report a bit more accurate info.

Signed-off-by: Badari Pulavarty &lt;pbadari@us.ibm.com&gt;
Signed-off-by: Andreas Dilger &lt;adilger@clusterfs.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
</feed>
