<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/fs/ext4, branch v3.17-rc2</title>
<subtitle>Linux kernel source tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/'/>
<entry>
<title>Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs</title>
<updated>2014-08-11T18:44:11+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2014-08-11T18:44:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=f6f993328b2abcab86a3c99d7bd9f2066ab03d36'/>
<id>f6f993328b2abcab86a3c99d7bd9f2066ab03d36</id>
<content type='text'>
Pull vfs updates from Al Viro:
 "Stuff in here:

   - acct.c fixes and general rework of mnt_pin mechanism.  That allows
     to go for delayed-mntput stuff, which will permit mntput() on deep
     stack without worrying about stack overflows - fs shutdown will
     happen on shallow stack.  IOW, we can do Eric's umount-on-rmdir
     series without introducing tons of stack overflows on new mntput()
     call chains it introduces.
   - Bruce's d_splice_alias() patches
   - more Miklos' rename() stuff.
   - a couple of regression fixes (stable fodder, in the end of branch)
     and a fix for API idiocy in iov_iter.c.

  There definitely will be another pile, maybe even two.  I'd like to
  get Eric's series in this time, but even if we miss it, it'll go right
  in the beginning of for-next in the next cycle - the tricky part of
  prereqs is in this pile"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (40 commits)
  fix copy_tree() regression
  __generic_file_write_iter(): fix handling of sync error after DIO
  switch iov_iter_get_pages() to passing maximal number of pages
  fs: mark __d_obtain_alias static
  dcache: d_splice_alias should detect loops
  exportfs: update Exporting documentation
  dcache: d_find_alias needn't recheck IS_ROOT &amp;&amp; DCACHE_DISCONNECTED
  dcache: remove unused d_find_alias parameter
  dcache: d_obtain_alias callers don't all want DISCONNECTED
  dcache: d_splice_alias should ignore DCACHE_DISCONNECTED
  dcache: d_splice_alias mustn't create directory aliases
  dcache: close d_move race in d_splice_alias
  dcache: move d_splice_alias
  namei: trivial fix to vfs_rename_dir comment
  VFS: allow -&gt;d_manage() to declare -EISDIR in rcu_walk mode.
  cifs: support RENAME_NOREPLACE
  hostfs: support rename flags
  shmem: support RENAME_EXCHANGE
  shmem: support RENAME_NOREPLACE
  btrfs: add RENAME_NOREPLACE
  ...
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull vfs updates from Al Viro:
 "Stuff in here:

   - acct.c fixes and general rework of mnt_pin mechanism.  That allows
     to go for delayed-mntput stuff, which will permit mntput() on deep
     stack without worrying about stack overflows - fs shutdown will
     happen on shallow stack.  IOW, we can do Eric's umount-on-rmdir
     series without introducing tons of stack overflows on new mntput()
     call chains it introduces.
   - Bruce's d_splice_alias() patches
   - more Miklos' rename() stuff.
   - a couple of regression fixes (stable fodder, in the end of branch)
     and a fix for API idiocy in iov_iter.c.

  There definitely will be another pile, maybe even two.  I'd like to
  get Eric's series in this time, but even if we miss it, it'll go right
  in the beginning of for-next in the next cycle - the tricky part of
  prereqs is in this pile"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (40 commits)
  fix copy_tree() regression
  __generic_file_write_iter(): fix handling of sync error after DIO
  switch iov_iter_get_pages() to passing maximal number of pages
  fs: mark __d_obtain_alias static
  dcache: d_splice_alias should detect loops
  exportfs: update Exporting documentation
  dcache: d_find_alias needn't recheck IS_ROOT &amp;&amp; DCACHE_DISCONNECTED
  dcache: remove unused d_find_alias parameter
  dcache: d_obtain_alias callers don't all want DISCONNECTED
  dcache: d_splice_alias should ignore DCACHE_DISCONNECTED
  dcache: d_splice_alias mustn't create directory aliases
  dcache: close d_move race in d_splice_alias
  dcache: move d_splice_alias
  namei: trivial fix to vfs_rename_dir comment
  VFS: allow -&gt;d_manage() to declare -EISDIR in rcu_walk mode.
  cifs: support RENAME_NOREPLACE
  hostfs: support rename flags
  shmem: support RENAME_EXCHANGE
  shmem: support RENAME_NOREPLACE
  btrfs: add RENAME_NOREPLACE
  ...
</pre>
</div>
</content>
</entry>
<entry>
<title>fs: call rename2 if exists</title>
<updated>2014-08-07T18:40:09+00:00</updated>
<author>
<name>Miklos Szeredi</name>
<email>mszeredi@suse.cz</email>
</author>
<published>2014-07-23T13:15:30+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=7177a9c4b509eb357cc450256bc3cf39f1a1e639'/>
<id>7177a9c4b509eb357cc450256bc3cf39f1a1e639</id>
<content type='text'>
Christoph Hellwig suggests:

1) make vfs_rename call -&gt;rename2 if it exists instead of -&gt;rename
2) switch all filesystems that you're adding NOREPLACE support for to
   use -&gt;rename2
3) see how many -&gt;rename instances we'll have left after a few
   iterations of 2.

Signed-off-by: Miklos Szeredi &lt;mszeredi@suse.cz&gt;
Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Christoph Hellwig suggests:

1) make vfs_rename call -&gt;rename2 if it exists instead of -&gt;rename
2) switch all filesystems that you're adding NOREPLACE support for to
   use -&gt;rename2
3) see how many -&gt;rename instances we'll have left after a few
   iterations of 2.

Signed-off-by: Miklos Szeredi &lt;mszeredi@suse.cz&gt;
Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ext4: fix ext4_discard_allocated_blocks() if we can't allocate the pa struct</title>
<updated>2014-07-31T02:17:17+00:00</updated>
<author>
<name>Theodore Ts'o</name>
<email>tytso@mit.edu</email>
</author>
<published>2014-07-31T02:17:17+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=86f0afd463215fc3e58020493482faa4ac3a4d69'/>
<id>86f0afd463215fc3e58020493482faa4ac3a4d69</id>
<content type='text'>
If there is a failure while allocating the preallocation structure, a
number of blocks can end up getting marked in the in-memory buddy
bitmap, and then not getting released.  This can result in the
following corruption getting reported by the kernel:

EXT4-fs error (device sda3): ext4_mb_generate_buddy:758: group 1126,
12793 clusters in bitmap, 12729 in gd

In that case, we need to release the blocks using mb_free_blocks().

Tested: fs smoke test; also demonstrated that with injected errors,
	the file system is no longer getting corrupted

Google-Bug-Id: 16657874

Signed-off-by: "Theodore Ts'o" &lt;tytso@mit.edu&gt;
Cc: stable@vger.kernel.org
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
If there is a failure while allocating the preallocation structure, a
number of blocks can end up getting marked in the in-memory buddy
bitmap, and then not getting released.  This can result in the
following corruption getting reported by the kernel:

EXT4-fs error (device sda3): ext4_mb_generate_buddy:758: group 1126,
12793 clusters in bitmap, 12729 in gd

In that case, we need to release the blocks using mb_free_blocks().

Tested: fs smoke test; also demonstrated that with injected errors,
	the file system is no longer getting corrupted

Google-Bug-Id: 16657874

Signed-off-by: "Theodore Ts'o" &lt;tytso@mit.edu&gt;
Cc: stable@vger.kernel.org
</pre>
</div>
</content>
</entry>
<entry>
<title>ext4: fix COLLAPSE RANGE test for bigalloc file systems</title>
<updated>2014-07-29T14:45:31+00:00</updated>
<author>
<name>Namjae Jeon</name>
<email>namjae.jeon@samsung.com</email>
</author>
<published>2014-07-29T14:45:31+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=ee98fa3a8b148a234600a20f7cdc2b4b37f38083'/>
<id>ee98fa3a8b148a234600a20f7cdc2b4b37f38083</id>
<content type='text'>
Blocks in collapse range should be collapsed per cluster unit when
bigalloc is enable. If bigalloc is not enable, EXT4_CLUSTER_SIZE will
be same with EXT4_BLOCK_SIZE.

With this bug fixed, patch enables COLLAPSE_RANGE for bigalloc, which
fixes a large number of xfstest failures which use fsx.

Signed-off-by: Namjae Jeon &lt;namjae.jeon@samsung.com&gt;
Signed-off-by: Ashish Sangwan &lt;a.sangwan@samsung.com&gt;
Signed-off-by: Theodore Ts'o &lt;tytso@mit.edu&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Blocks in collapse range should be collapsed per cluster unit when
bigalloc is enable. If bigalloc is not enable, EXT4_CLUSTER_SIZE will
be same with EXT4_BLOCK_SIZE.

With this bug fixed, patch enables COLLAPSE_RANGE for bigalloc, which
fixes a large number of xfstest failures which use fsx.

Signed-off-by: Namjae Jeon &lt;namjae.jeon@samsung.com&gt;
Signed-off-by: Ashish Sangwan &lt;a.sangwan@samsung.com&gt;
Signed-off-by: Theodore Ts'o &lt;tytso@mit.edu&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ext4: check inline directory before converting</title>
<updated>2014-07-28T17:06:26+00:00</updated>
<author>
<name>Darrick J. Wong</name>
<email>darrick.wong@oracle.com</email>
</author>
<published>2014-07-28T17:06:26+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=40b163f1c45f52752677e66d2fd273dbfd273a22'/>
<id>40b163f1c45f52752677e66d2fd273dbfd273a22</id>
<content type='text'>
Before converting an inline directory to a regular directory, check
the directory entries to make sure they're not obviously broken.
This helps us to avoid a BUG_ON if one of the dirents is trashed.

Signed-off-by: Darrick J. Wong &lt;darrick.wong@oracle.com&gt;
Signed-off-by: Theodore Ts'o &lt;tytso@mit.edu&gt;
Reviewed-by: Andreas Dilger &lt;adilger@dilger.ca&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Before converting an inline directory to a regular directory, check
the directory entries to make sure they're not obviously broken.
This helps us to avoid a BUG_ON if one of the dirents is trashed.

Signed-off-by: Darrick J. Wong &lt;darrick.wong@oracle.com&gt;
Signed-off-by: Theodore Ts'o &lt;tytso@mit.edu&gt;
Reviewed-by: Andreas Dilger &lt;adilger@dilger.ca&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ext4: fix incorrect locking in move_extent_per_page</title>
<updated>2014-07-28T02:32:27+00:00</updated>
<author>
<name>Dmitry Monakhov</name>
<email>dmonakhov@openvz.org</email>
</author>
<published>2014-07-28T02:32:27+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=6e2631463f3a2ce430a295c68aead3ff228ca3cf'/>
<id>6e2631463f3a2ce430a295c68aead3ff228ca3cf</id>
<content type='text'>
If we have to copy data we must drop i_data_sem because of
get_blocks() will be called inside mext_page_mkuptodate(), but later we must
reacquire it again because we are about to change extent's tree

Signed-off-by: Dmitry Monakhov &lt;dmonakhov@openvz.org&gt;
Signed-off-by: Theodore Ts'o &lt;tytso@mit.edu&gt;
Reviewed-by: Jan Kara &lt;jack@suse.cz&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
If we have to copy data we must drop i_data_sem because of
get_blocks() will be called inside mext_page_mkuptodate(), but later we must
reacquire it again because we are about to change extent's tree

Signed-off-by: Dmitry Monakhov &lt;dmonakhov@openvz.org&gt;
Signed-off-by: Theodore Ts'o &lt;tytso@mit.edu&gt;
Reviewed-by: Jan Kara &lt;jack@suse.cz&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ext4: use correct depth value</title>
<updated>2014-07-28T02:30:29+00:00</updated>
<author>
<name>Dmitry Monakhov</name>
<email>dmonakhov@openvz.org</email>
</author>
<published>2014-07-28T02:30:29+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=29faed1638e6b2ace499e1b848cc3337ef6a6ad5'/>
<id>29faed1638e6b2ace499e1b848cc3337ef6a6ad5</id>
<content type='text'>
Inode's depth can be changed from here:
ext4_ext_try_to_merge() -&gt;ext4_ext_try_to_merge_up()
We must use correct value.

Signed-off-by: Dmitry Monakhov &lt;dmonakhov@openvz.org&gt;
Signed-off-by: Theodore Ts'o &lt;tytso@mit.edu&gt;
Reviewed-by: Jan Kara &lt;jack@suse.cz&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Inode's depth can be changed from here:
ext4_ext_try_to_merge() -&gt;ext4_ext_try_to_merge_up()
We must use correct value.

Signed-off-by: Dmitry Monakhov &lt;dmonakhov@openvz.org&gt;
Signed-off-by: Theodore Ts'o &lt;tytso@mit.edu&gt;
Reviewed-by: Jan Kara &lt;jack@suse.cz&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ext4: add i_data_sem sanity check</title>
<updated>2014-07-28T02:28:15+00:00</updated>
<author>
<name>Dmitry Monakhov</name>
<email>dmonakhov@openvz.org</email>
</author>
<published>2014-07-28T02:28:15+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=4b1f1660710c2ddcefc08c289e66a32b6bc2a2e3'/>
<id>4b1f1660710c2ddcefc08c289e66a32b6bc2a2e3</id>
<content type='text'>
Each caller of ext4_ext_dirty must hold i_data_sem,
The only exception is migration code, let's make it convenient.

Signed-off-by: Dmitry Monakhov &lt;dmonakhov@openvz.org&gt;
Signed-off-by: Theodore Ts'o &lt;tytso@mit.edu&gt;
Reviewed-by: Jan Kara &lt;jack@suse.cz&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Each caller of ext4_ext_dirty must hold i_data_sem,
The only exception is migration code, let's make it convenient.

Signed-off-by: Dmitry Monakhov &lt;dmonakhov@openvz.org&gt;
Signed-off-by: Theodore Ts'o &lt;tytso@mit.edu&gt;
Reviewed-by: Jan Kara &lt;jack@suse.cz&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ext4: fix wrong size computation in ext4_mb_normalize_request()</title>
<updated>2014-07-28T02:26:36+00:00</updated>
<author>
<name>Xiaoguang Wang</name>
<email>wangxg.fnst@cn.fujitsu.com</email>
</author>
<published>2014-07-28T02:26:36+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=b27b1535acc0e97b7435bccf205b3fe9c782afac'/>
<id>b27b1535acc0e97b7435bccf205b3fe9c782afac</id>
<content type='text'>
As the member fe_len defined in struct ext4_free_extent is expressed as
number of clusters, the variable "size" computation is wrong, we need to
first translate fe_len to block number, then to bytes.

Signed-off-by: Xiaoguang Wang &lt;wangxg.fnst@cn.fujitsu.com&gt;
Signed-off-by: Theodore Ts'o &lt;tytso@mit.edu&gt;
Reviewed-by: Lukas Czerner &lt;lczerner@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
As the member fe_len defined in struct ext4_free_extent is expressed as
number of clusters, the variable "size" computation is wrong, we need to
first translate fe_len to block number, then to bytes.

Signed-off-by: Xiaoguang Wang &lt;wangxg.fnst@cn.fujitsu.com&gt;
Signed-off-by: Theodore Ts'o &lt;tytso@mit.edu&gt;
Reviewed-by: Lukas Czerner &lt;lczerner@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ext4: make ext4_has_inline_data() as a inline function</title>
<updated>2014-07-15T14:10:04+00:00</updated>
<author>
<name>Zheng Liu</name>
<email>wenqing.lz@taobao.com</email>
</author>
<published>2014-07-15T14:10:04+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=83447ccb4df6e6051027bedff2a8cfc6a52ebcf8'/>
<id>83447ccb4df6e6051027bedff2a8cfc6a52ebcf8</id>
<content type='text'>
Now ext4_has_inline_data() is used in wide spread codepaths.  So we need
to make it as a inline function to avoid burning some CPU cycles.

Change in text size:

         text     data      bss     dec     hex filename
before: 326110    19258    5528  350896   55ab0 fs/ext4/ext4.o
after:  326227    19258    5528  351013   55b25 fs/ext4/ext4.o

I use the following script to measure the CPU usage.

  #!/bin/bash

  shm_base='/dev/shm'
  img=${shm_base}/ext4-img
  mnt=/mnt/loop

  e2fsprgs_base=$HOME/e2fsprogs
  mkfs=${e2fsprgs_base}/misc/mke2fs
  fsck=${e2fsprgs_base}/e2fsck/e2fsck

  sudo umount $mnt
  dd if=/dev/zero of=$img bs=4k count=3145728
  ${mkfs} -t ext4 -O inline_data -F $img
  sudo mount -t ext4 -o loop $img $mnt

  # start testing...
  testdir="${mnt}/testdir"
  mkdir $testdir
  cd $testdir

  echo "start testing..."
  for ((cnt=0;cnt&lt;100;cnt++)); do

  for ((i=0;i&lt;5;i++)); do
  	for ((j=0;j&lt;5;j++)); do
  		for ((k=0;k&lt;5;k++)); do
  			for ((l=0;l&lt;5;l++)); do
  				mkdir -p $i/$j/$k/$l
  				echo "$i-$j-$k-$l" &gt; $i/$j/$k/$l/testfile
  			done
  		done
  	done
  done

  ls -R $testdir &gt; /dev/null
  rm -rf $testdir/*

  done

The result of `perf top -G -U` is as below.

vanilla:
 13.92%  [ext4]  [k] ext4_do_update_inode
  9.36%  [ext4]  [k] __ext4_get_inode_loc
  4.07%  [ext4]  [k] ftrace_define_fields_ext4_writepages
  3.83%  [ext4]  [k] __ext4_handle_dirty_metadata
  3.42%  [ext4]  [k] ext4_get_inode_flags
  2.71%  [ext4]  [k] ext4_mark_iloc_dirty
  2.46%  [ext4]  [k] ftrace_define_fields_ext4_direct_IO_enter
  2.26%  [ext4]  [k] ext4_get_inode_loc
  2.22%  [ext4]  [k] ext4_has_inline_data
  [...]

After applied the patch, we don't see ext4_has_inline_data() because it
has been inlined and perf couldn't sample it.  Although it doesn't mean
that the CPU cycles can be saved but at least the overhead of function
calls can be eliminated.  So IMHO we'd better inline this function.

Cc: Andreas Dilger &lt;adilger.kernel@dilger.ca&gt;
Signed-off-by: Zheng Liu &lt;wenqing.lz@taobao.com&gt;
Signed-off-by: Theodore Ts'o &lt;tytso@mit.edu&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Now ext4_has_inline_data() is used in wide spread codepaths.  So we need
to make it as a inline function to avoid burning some CPU cycles.

Change in text size:

         text     data      bss     dec     hex filename
before: 326110    19258    5528  350896   55ab0 fs/ext4/ext4.o
after:  326227    19258    5528  351013   55b25 fs/ext4/ext4.o

I use the following script to measure the CPU usage.

  #!/bin/bash

  shm_base='/dev/shm'
  img=${shm_base}/ext4-img
  mnt=/mnt/loop

  e2fsprgs_base=$HOME/e2fsprogs
  mkfs=${e2fsprgs_base}/misc/mke2fs
  fsck=${e2fsprgs_base}/e2fsck/e2fsck

  sudo umount $mnt
  dd if=/dev/zero of=$img bs=4k count=3145728
  ${mkfs} -t ext4 -O inline_data -F $img
  sudo mount -t ext4 -o loop $img $mnt

  # start testing...
  testdir="${mnt}/testdir"
  mkdir $testdir
  cd $testdir

  echo "start testing..."
  for ((cnt=0;cnt&lt;100;cnt++)); do

  for ((i=0;i&lt;5;i++)); do
  	for ((j=0;j&lt;5;j++)); do
  		for ((k=0;k&lt;5;k++)); do
  			for ((l=0;l&lt;5;l++)); do
  				mkdir -p $i/$j/$k/$l
  				echo "$i-$j-$k-$l" &gt; $i/$j/$k/$l/testfile
  			done
  		done
  	done
  done

  ls -R $testdir &gt; /dev/null
  rm -rf $testdir/*

  done

The result of `perf top -G -U` is as below.

vanilla:
 13.92%  [ext4]  [k] ext4_do_update_inode
  9.36%  [ext4]  [k] __ext4_get_inode_loc
  4.07%  [ext4]  [k] ftrace_define_fields_ext4_writepages
  3.83%  [ext4]  [k] __ext4_handle_dirty_metadata
  3.42%  [ext4]  [k] ext4_get_inode_flags
  2.71%  [ext4]  [k] ext4_mark_iloc_dirty
  2.46%  [ext4]  [k] ftrace_define_fields_ext4_direct_IO_enter
  2.26%  [ext4]  [k] ext4_get_inode_loc
  2.22%  [ext4]  [k] ext4_has_inline_data
  [...]

After applied the patch, we don't see ext4_has_inline_data() because it
has been inlined and perf couldn't sample it.  Although it doesn't mean
that the CPU cycles can be saved but at least the overhead of function
calls can be eliminated.  So IMHO we'd better inline this function.

Cc: Andreas Dilger &lt;adilger.kernel@dilger.ca&gt;
Signed-off-by: Zheng Liu &lt;wenqing.lz@taobao.com&gt;
Signed-off-by: Theodore Ts'o &lt;tytso@mit.edu&gt;
</pre>
</div>
</content>
</entry>
</feed>
