<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-stable.git/fs/btrfs/ordered-data.c, branch v3.8</title>
<subtitle>Linux kernel stable tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/'/>
<entry>
<title>Btrfs: fix possible stale data exposure</title>
<updated>2013-02-05T21:09:16+00:00</updated>
<author>
<name>Josef Bacik</name>
<email>jbacik@fusionio.com</email>
</author>
<published>2013-01-30T19:31:31+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=59fe4f41976f6331b695ff049296d082cf621823'/>
<id>59fe4f41976f6331b695ff049296d082cf621823</id>
<content type='text'>
We specifically do not update the disk i_size if there are ordered extents
outstanding for any area between the current disk_i_size and our ordered
extent so that we do not expose stale data.  The problem is the check we
have only checks if the ordered extent starts at or after the current
disk_i_size, which doesn't take into account an ordered extent that starts
before the current disk_i_size and ends past the disk_i_size.  Fix this by
checking if the extent ends past the disk_i_size.  Thanks,

Signed-off-by: Josef Bacik &lt;jbacik@fusionio.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
We specifically do not update the disk i_size if there are ordered extents
outstanding for any area between the current disk_i_size and our ordered
extent so that we do not expose stale data.  The problem is the check we
have only checks if the ordered extent starts at or after the current
disk_i_size, which doesn't take into account an ordered extent that starts
before the current disk_i_size and ends past the disk_i_size.  Fix this by
checking if the extent ends past the disk_i_size.  Thanks,

Signed-off-by: Josef Bacik &lt;jbacik@fusionio.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Btrfs: fix missing i_size update</title>
<updated>2013-02-05T21:09:14+00:00</updated>
<author>
<name>Josef Bacik</name>
<email>jbacik@fusionio.com</email>
</author>
<published>2013-01-30T19:17:31+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=5d1f40202bad12d4c70a2d40a420b30d23a72b1a'/>
<id>5d1f40202bad12d4c70a2d40a420b30d23a72b1a</id>
<content type='text'>
If we have an ordered extent before the ordered extent we are currently
completing that is after the current disk_i_size we will put our i_size
update into that ordered extent so that we do not expose stale data.  The
problem is that if our disk i_size is updated past the previous ordered
extent we won't update the i_size with the pending i_size update.  So check
the pending i_size update and if its above the current disk i_size we need
to go ahead and try to update.  Thanks,

Signed-off-by: Josef Bacik &lt;jbacik@fusionio.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
If we have an ordered extent before the ordered extent we are currently
completing that is after the current disk_i_size we will put our i_size
update into that ordered extent so that we do not expose stale data.  The
problem is that if our disk i_size is updated past the previous ordered
extent we won't update the i_size with the pending i_size update.  So check
the pending i_size update and if its above the current disk i_size we need
to go ahead and try to update.  Thanks,

Signed-off-by: Josef Bacik &lt;jbacik@fusionio.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Btrfs: cleanup for btrfs_wait_order_range</title>
<updated>2012-12-12T22:15:19+00:00</updated>
<author>
<name>Liu Bo</name>
<email>bo.li.liu@oracle.com</email>
</author>
<published>2012-11-01T06:38:49+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=4fde183d8c755f8a8bdffcb03a8d947e62ccea6a'/>
<id>4fde183d8c755f8a8bdffcb03a8d947e62ccea6a</id>
<content type='text'>
Variable 'found' is no more used.

Signed-off-by: Liu Bo &lt;bo.li.liu@oracle.com&gt;
Signed-off-by: Chris Mason &lt;chris.mason@fusionio.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Variable 'found' is no more used.

Signed-off-by: Liu Bo &lt;bo.li.liu@oracle.com&gt;
Signed-off-by: Chris Mason &lt;chris.mason@fusionio.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Btrfs: make ordered extent be flushed by multi-task</title>
<updated>2012-12-11T18:31:38+00:00</updated>
<author>
<name>Miao Xie</name>
<email>miaox@cn.fujitsu.com</email>
</author>
<published>2012-10-25T09:41:36+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=9afab8820bb8b55af669b199597d6716e04d1ba8'/>
<id>9afab8820bb8b55af669b199597d6716e04d1ba8</id>
<content type='text'>
Though the process of the ordered extents is a bit different with the delalloc inode
flush, but we can see it as a subset of the delalloc inode flush, so we also handle
them by flush workers.

Signed-off-by: Miao Xie &lt;miaox@cn.fujitsu.com&gt;
Signed-off-by: Chris Mason &lt;chris.mason@fusionio.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Though the process of the ordered extents is a bit different with the delalloc inode
flush, but we can see it as a subset of the delalloc inode flush, so we also handle
them by flush workers.

Signed-off-by: Miao Xie &lt;miaox@cn.fujitsu.com&gt;
Signed-off-by: Chris Mason &lt;chris.mason@fusionio.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Btrfs: make ordered operations be handled by multi-task</title>
<updated>2012-12-11T18:31:37+00:00</updated>
<author>
<name>Miao Xie</name>
<email>miaox@cn.fujitsu.com</email>
</author>
<published>2012-10-25T09:31:03+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=25287e0a16c0ad068aa89ab01aea6c699b31ec12'/>
<id>25287e0a16c0ad068aa89ab01aea6c699b31ec12</id>
<content type='text'>
The process of the ordered operations is similar to the delalloc inode flush, so
we handle them by flush workers.

Signed-off-by: Miao Xie &lt;miaox@cn.fujitsu.com&gt;
Signed-off-by: Chris Mason &lt;chris.mason@fusionio.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The process of the ordered operations is similar to the delalloc inode flush, so
we handle them by flush workers.

Signed-off-by: Miao Xie &lt;miaox@cn.fujitsu.com&gt;
Signed-off-by: Chris Mason &lt;chris.mason@fusionio.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Btrfs: kill obsolete arguments in btrfs_wait_ordered_extents</title>
<updated>2012-10-04T13:39:57+00:00</updated>
<author>
<name>Liu Bo</name>
<email>bo.li.liu@oracle.com</email>
</author>
<published>2012-09-14T08:58:07+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=6bbe3a9c805fcb8cd8d396dafd32078181a7cdd5'/>
<id>6bbe3a9c805fcb8cd8d396dafd32078181a7cdd5</id>
<content type='text'>
nocow_only is now an obsolete argument.

Signed-off-by: Liu Bo &lt;bo.li.liu@oracle.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
nocow_only is now an obsolete argument.

Signed-off-by: Liu Bo &lt;bo.li.liu@oracle.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Btrfs: use a slab for ordered extents allocation</title>
<updated>2012-10-01T19:19:11+00:00</updated>
<author>
<name>Miao Xie</name>
<email>miaox@cn.fujitsu.com</email>
</author>
<published>2012-09-06T10:01:51+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=6352b91da1a2108bb8cc5115e8714f90d706f15f'/>
<id>6352b91da1a2108bb8cc5115e8714f90d706f15f</id>
<content type='text'>
The ordered extent allocation is in the fast path of the IO, so use a slab
to improve the speed of the allocation.

 "Size of the struct is 280, so this will fall into the size-512 bucket,
  giving 8 objects per page, while own slab will pack 14 objects into a page.

  Another benefit I see is to check for leaked objects when the module is
  removed (and the cache destroy takes place)."
						-- David Sterba

Signed-off-by: Miao Xie &lt;miaox@cn.fujitsu.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The ordered extent allocation is in the fast path of the IO, so use a slab
to improve the speed of the allocation.

 "Size of the struct is 280, so this will fall into the size-512 bucket,
  giving 8 objects per page, while own slab will pack 14 objects into a page.

  Another benefit I see is to check for leaked objects when the module is
  removed (and the cache destroy takes place)."
						-- David Sterba

Signed-off-by: Miao Xie &lt;miaox@cn.fujitsu.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Btrfs: fix file extent discount problem in the, snapshot</title>
<updated>2012-10-01T19:19:10+00:00</updated>
<author>
<name>Miao Xie</name>
<email>miaox@cn.fujitsu.com</email>
</author>
<published>2012-09-06T10:01:21+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=b9a8cc5bef963b76c5b6c3016b7e91988a3e758b'/>
<id>b9a8cc5bef963b76c5b6c3016b7e91988a3e758b</id>
<content type='text'>
If a snapshot is created while we are writing some data into the file,
the i_size of the corresponding file in the snapshot will be wrong, it will
be beyond the end of the last file extent. And btrfsck will report:
  root 256 inode 257 errors 100

Steps to reproduce:
 # mkfs.btrfs &lt;partition&gt;
 # mount &lt;partition&gt; &lt;mnt&gt;
 # cd &lt;mnt&gt;
 # dd if=/dev/zero of=tmpfile bs=4M count=1024 &amp;
 # for ((i=0; i&lt;4; i++))
 &gt; do
 &gt; btrfs sub snap . $i
 &gt; done

This because the algorithm of disk_i_size update is wrong. Though there are
some ordered extents behind the current one which we use to update disk_i_size,
it doesn't mean those extents will be dealt with in the same transaction. So
We shouldn't use the offset of those extents to update disk_i_size. Or we will
get the wrong i_size in the snapshot.

We fix this problem by recording the max real i_size. If we find there is a
ordered extent which is in front of the current one and doesn't complete, we
will record the end of the current one into that ordered extent. Surely, if
the current extent holds the end of other extent(it must be greater than
the current one because it is behind the current one), we will record the
number that the current extent holds. In this way, we can exclude the ordered
extents that may not be dealth with in the same transaction, and be easy to
know the real disk_i_size.

Signed-off-by: Miao Xie &lt;miaox@cn.fujitsu.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
If a snapshot is created while we are writing some data into the file,
the i_size of the corresponding file in the snapshot will be wrong, it will
be beyond the end of the last file extent. And btrfsck will report:
  root 256 inode 257 errors 100

Steps to reproduce:
 # mkfs.btrfs &lt;partition&gt;
 # mount &lt;partition&gt; &lt;mnt&gt;
 # cd &lt;mnt&gt;
 # dd if=/dev/zero of=tmpfile bs=4M count=1024 &amp;
 # for ((i=0; i&lt;4; i++))
 &gt; do
 &gt; btrfs sub snap . $i
 &gt; done

This because the algorithm of disk_i_size update is wrong. Though there are
some ordered extents behind the current one which we use to update disk_i_size,
it doesn't mean those extents will be dealt with in the same transaction. So
We shouldn't use the offset of those extents to update disk_i_size. Or we will
get the wrong i_size in the snapshot.

We fix this problem by recording the max real i_size. If we find there is a
ordered extent which is in front of the current one and doesn't complete, we
will record the end of the current one into that ordered extent. Surely, if
the current extent holds the end of other extent(it must be greater than
the current one because it is behind the current one), we will record the
number that the current extent holds. In this way, we can exclude the ordered
extents that may not be dealth with in the same transaction, and be easy to
know the real disk_i_size.

Signed-off-by: Miao Xie &lt;miaox@cn.fujitsu.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>btrfs: nuke pdflush from comments</title>
<updated>2012-08-04T08:15:35+00:00</updated>
<author>
<name>Artem Bityutskiy</name>
<email>artem.bityutskiy@linux.intel.com</email>
</author>
<published>2012-07-25T15:12:06+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=b257031408945eb89980e14cb79d5fd854d8f25f'/>
<id>b257031408945eb89980e14cb79d5fd854d8f25f</id>
<content type='text'>
The pdflush thread is long gone, so this patch removes references to pdflush
from btrfs comments.

Cc: Chris Mason &lt;chris.mason@fusionio.com&gt;
Cc: linux-btrfs@vger.kernel.org
Signed-off-by: Artem Bityutskiy &lt;artem.bityutskiy@linux.intel.com&gt;
Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The pdflush thread is long gone, so this patch removes references to pdflush
from btrfs comments.

Cc: Chris Mason &lt;chris.mason@fusionio.com&gt;
Cc: linux-btrfs@vger.kernel.org
Signed-off-by: Artem Bityutskiy &lt;artem.bityutskiy@linux.intel.com&gt;
Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Btrfs: call filemap_fdatawrite twice for compression</title>
<updated>2012-06-15T01:30:54+00:00</updated>
<author>
<name>Josef Bacik</name>
<email>josef@redhat.com</email>
</author>
<published>2012-06-08T19:26:47+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=7ddf5a42d311d74fd9f7373cb56def0843c219f8'/>
<id>7ddf5a42d311d74fd9f7373cb56def0843c219f8</id>
<content type='text'>
I removed this in an earlier commit and I was wrong.  Because compression
can return from filemap_fdatawrite() without having actually set any of it's
pages as writeback() it can make filemap_fdatawait() do essentially nothing,
and then we won't find any ordered extents because they may not have been
created yet.  So not only does this make fsync() completely useless, but it
will also screw up if you truncate on a non-page aligned offset since we
zero out the end and then wait on ordered extents and then call drop caches.
We can drop the cache before the io completes and then we try to unpin the
extent we just wrote we won't find it and everything goes sideways.  So fix
this by putting it back and put a giant comment there to keep me from trying
to remove it in the future.  Thanks,

Signed-off-by: Josef Bacik &lt;josef@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
I removed this in an earlier commit and I was wrong.  Because compression
can return from filemap_fdatawrite() without having actually set any of it's
pages as writeback() it can make filemap_fdatawait() do essentially nothing,
and then we won't find any ordered extents because they may not have been
created yet.  So not only does this make fsync() completely useless, but it
will also screw up if you truncate on a non-page aligned offset since we
zero out the end and then wait on ordered extents and then call drop caches.
We can drop the cache before the io completes and then we try to unpin the
extent we just wrote we won't find it and everything goes sideways.  So fix
this by putting it back and put a giant comment there to keep me from trying
to remove it in the future.  Thanks,

Signed-off-by: Josef Bacik &lt;josef@redhat.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
