<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-stable.git/drivers/md/bcache, branch v3.13.2</title>
<subtitle>Linux kernel stable tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/'/>
<entry>
<title>bcache: Data corruption fix</title>
<updated>2014-02-06T19:34:06+00:00</updated>
<author>
<name>Kent Overstreet</name>
<email>kmo@daterainc.com</email>
</author>
<published>2013-12-18T01:51:02+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=9047ec62665cb8931840214b038cc1d22c4ed2e4'/>
<id>9047ec62665cb8931840214b038cc1d22c4ed2e4</id>
<content type='text'>
commit ef71ec00002d92a08eb27e9d036e3d48835b6597 upstream.

The code that handles overlapping extents that we've just read back in from disk
was depending on the behaviour of the code that handles overlapping extents as
we're inserting into a btree node in the case of an insert that forced an
existing extent to be split: on insert, if we had to split we'd also insert a
new extent to represent the top part of the old extent - and then that new
extent would get written out.

The code that read the extents back in thus not bother with splitting extents -
if it saw an extent that ovelapped in the middle of an older extent, it would
trim the old extent to only represent the bottom part, assuming that the
original insert would've inserted a new extent to represent the top part.

I still haven't figured out _how_ it can happen, but I'm now pretty convinced
(and testing has confirmed) that there's some kind of an obscure corner case
(probably involving extent merging, and multiple overwrites in different sets)
that breaks this. The fix is to change the mergesort fixup code to split extents
itself when required.

Signed-off-by: Kent Overstreet &lt;kmo@daterainc.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit ef71ec00002d92a08eb27e9d036e3d48835b6597 upstream.

The code that handles overlapping extents that we've just read back in from disk
was depending on the behaviour of the code that handles overlapping extents as
we're inserting into a btree node in the case of an insert that forced an
existing extent to be split: on insert, if we had to split we'd also insert a
new extent to represent the top part of the old extent - and then that new
extent would get written out.

The code that read the extents back in thus not bother with splitting extents -
if it saw an extent that ovelapped in the middle of an older extent, it would
trim the old extent to only represent the bottom part, assuming that the
original insert would've inserted a new extent to represent the top part.

I still haven't figured out _how_ it can happen, but I'm now pretty convinced
(and testing has confirmed) that there's some kind of an obscure corner case
(probably involving extent merging, and multiple overwrites in different sets)
that breaks this. The fix is to change the mergesort fixup code to split extents
itself when required.

Signed-off-by: Kent Overstreet &lt;kmo@daterainc.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>bcache: New writeback PD controller</title>
<updated>2013-12-16T22:22:59+00:00</updated>
<author>
<name>Kent Overstreet</name>
<email>kmo@daterainc.com</email>
</author>
<published>2013-11-11T21:58:34+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=16749c23c00c686ed168471963e3ddb0f3fcd855'/>
<id>16749c23c00c686ed168471963e3ddb0f3fcd855</id>
<content type='text'>
The old writeback PD controller could get into states where it had throttled all
the way down and take way too long to recover - it was too complicated to really
understand what it was doing.

This rewrites a good chunk of it to hopefully be simpler and make more sense,
and it also pays more attention to units which should make the behaviour a bit
easier to understand.

Signed-off-by: Kent Overstreet &lt;kmo@daterainc.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The old writeback PD controller could get into states where it had throttled all
the way down and take way too long to recover - it was too complicated to really
understand what it was doing.

This rewrites a good chunk of it to hopefully be simpler and make more sense,
and it also pays more attention to units which should make the behaviour a bit
easier to understand.

Signed-off-by: Kent Overstreet &lt;kmo@daterainc.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>bcache: bugfix for race between moving_gc and bucket_invalidate</title>
<updated>2013-12-16T22:22:58+00:00</updated>
<author>
<name>Kent Overstreet</name>
<email>kmo@daterainc.com</email>
</author>
<published>2013-12-16T22:12:09+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=6d3d1a9c542b19dff1c7d7c8354d0869e4655287'/>
<id>6d3d1a9c542b19dff1c7d7c8354d0869e4655287</id>
<content type='text'>
There is a possibility for a bucket to be invalidated by the allocator
while moving_gc was copying it's contents to another bucket, if the
bucket only held cached data. To prevent this moving checks for
a stale ptr (to an invalidated bucket), before and after reads.
It it finds one, it simply ignores moving that data. This only
affects bcache if the moving_gc was turned on, note that it's
off by default.

Signed-off-by: Nicholas Swenson &lt;nks@daterainc.com&gt;
Signed-off-by: Kent Overstreet &lt;kmo@daterainc.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
There is a possibility for a bucket to be invalidated by the allocator
while moving_gc was copying it's contents to another bucket, if the
bucket only held cached data. To prevent this moving checks for
a stale ptr (to an invalidated bucket), before and after reads.
It it finds one, it simply ignores moving that data. This only
affects bcache if the moving_gc was turned on, note that it's
off by default.

Signed-off-by: Nicholas Swenson &lt;nks@daterainc.com&gt;
Signed-off-by: Kent Overstreet &lt;kmo@daterainc.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>bcache: fix for gc and writeback race</title>
<updated>2013-12-16T22:22:58+00:00</updated>
<author>
<name>Nicholas Swenson</name>
<email>nks@daterainc.com</email>
</author>
<published>2013-11-27T03:14:23+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=bf0a628a95dba7f983b6047cea695fb066fb2512'/>
<id>bf0a628a95dba7f983b6047cea695fb066fb2512</id>
<content type='text'>
Garbage collector needs to check keys in the writeback keybuf to
make sure it's not invalidating buckets to which the writeback
keys point to.

Signed-off-by: Nicholas Swenson &lt;nks@daterainc.com&gt;
Signed-off-by: Kent Overstreet &lt;kmo@daterainc.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Garbage collector needs to check keys in the writeback keybuf to
make sure it's not invalidating buckets to which the writeback
keys point to.

Signed-off-by: Nicholas Swenson &lt;nks@daterainc.com&gt;
Signed-off-by: Kent Overstreet &lt;kmo@daterainc.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>bcache: bugfix - moving_gc now moves only correct buckets</title>
<updated>2013-12-16T22:22:58+00:00</updated>
<author>
<name>Nicholas Swenson</name>
<email>nks@daterainc.com</email>
</author>
<published>2013-11-08T01:53:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=981aa8c091e164ea51dd1e81b71a1f3852bbcceb'/>
<id>981aa8c091e164ea51dd1e81b71a1f3852bbcceb</id>
<content type='text'>
Removed gc_move_threshold because picking buckets only by
threshold could lead moving extra buckets (ei. if there are
buckets at the threshold that aren't supposed to be moved
do to space considerations).

This is replaced by a GC_MOVE bit in the gc_mark bitmask.
Now only marked buckets get moved.

Signed-off-by: Nicholas Swenson &lt;nks@daterainc.com&gt;
Signed-off-by: Kent Overstreet &lt;kmo@daterainc.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Removed gc_move_threshold because picking buckets only by
threshold could lead moving extra buckets (ei. if there are
buckets at the threshold that aren't supposed to be moved
do to space considerations).

This is replaced by a GC_MOVE bit in the gc_mark bitmask.
Now only marked buckets get moved.

Signed-off-by: Nicholas Swenson &lt;nks@daterainc.com&gt;
Signed-off-by: Kent Overstreet &lt;kmo@daterainc.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>bcache: fix for gc crashing when no sectors are used</title>
<updated>2013-12-16T22:22:57+00:00</updated>
<author>
<name>Nicholas Swenson</name>
<email>nks@daterainc.com</email>
</author>
<published>2013-11-01T02:25:18+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=bee63f40cb5f5e8ab2abfbc85acde99cc0acd4b5'/>
<id>bee63f40cb5f5e8ab2abfbc85acde99cc0acd4b5</id>
<content type='text'>
Signed-off-by: Nicholas Swenson &lt;nks@daterainc.com&gt;
Signed-off-by: Kent Overstreet &lt;kmo@daterainc.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Signed-off-by: Nicholas Swenson &lt;nks@daterainc.com&gt;
Signed-off-by: Kent Overstreet &lt;kmo@daterainc.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>bcache: Fix heap_peek() macro</title>
<updated>2013-12-16T22:22:57+00:00</updated>
<author>
<name>Nicholas Swenson</name>
<email>nks@daterainc.com</email>
</author>
<published>2013-10-24T00:35:26+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=97d11a660fd906dbea3dccd2638495d8497c3c81'/>
<id>97d11a660fd906dbea3dccd2638495d8497c3c81</id>
<content type='text'>
Signed-off-by: Nicholas Swenson &lt;nks@daterainc.com&gt;
Signed-off-by: Kent Overstreet &lt;kmo@daterainc.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Signed-off-by: Nicholas Swenson &lt;nks@daterainc.com&gt;
Signed-off-by: Kent Overstreet &lt;kmo@daterainc.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>bcache: Fix for can_attach_cache()</title>
<updated>2013-12-16T22:22:57+00:00</updated>
<author>
<name>Nicholas Swenson</name>
<email>nks@daterainc.com</email>
</author>
<published>2013-10-22T20:19:23+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=9eb8ebeb2471b8cbaa078066aeb85fe5d46f5304'/>
<id>9eb8ebeb2471b8cbaa078066aeb85fe5d46f5304</id>
<content type='text'>
Signed-off-by: Nicholas Swenson &lt;nks@daterainc.com&gt;
Signed-off-by: Kent Overstreet &lt;kmo@daterainc.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Signed-off-by: Nicholas Swenson &lt;nks@daterainc.com&gt;
Signed-off-by: Kent Overstreet &lt;kmo@daterainc.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>bcache: Fix dirty_data accounting</title>
<updated>2013-12-16T22:22:16+00:00</updated>
<author>
<name>Kent Overstreet</name>
<email>kmo@daterainc.com</email>
</author>
<published>2013-11-11T05:55:27+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=d24a6e1087030b6da286df9433add5fa2f21b83b'/>
<id>d24a6e1087030b6da286df9433add5fa2f21b83b</id>
<content type='text'>
Dirty data accounting wasn't quite right - firstly, we were adding the key we're
inserting after it could have merged with another dirty key already in the
btree, and secondly we could sometimes pass the wrong offset to
bcache_dev_sectors_dirty_add() for dirty data we were overwriting - which is
important when tracking dirty data by stripe.

NOTE FOR BACKPORTERS: For 3.10 (and 3.11?) there's other accounting fixes
necessary that got squashed in with other patches; the full patch against 3.10
is 408cc2f47eeac93a, available at:
  git://evilpiepirate.org/~kent/linux-bcache.git bcache-3.10-writeback-fixes

Signed-off-by: Kent Overstreet &lt;kmo@daterainc.com&gt;
Cc: linux-stable &lt;stable@vger.kernel.org&gt; # &gt;= v3.10

diff --git a/drivers/md/bcache/btree.c b/drivers/md/bcache/btree.c
index 2a46036..4a12b2f 100644
--- a/drivers/md/bcache/btree.c
+++ b/drivers/md/bcache/btree.c
@@ -1817,7 +1817,8 @@ static bool fix_overlapping_extents(struct btree *b, struct bkey *insert,
 			if (KEY_START(k) &gt; KEY_START(insert) + sectors_found)
 				goto check_failed;

-			if (KEY_PTRS(replace_key) != KEY_PTRS(k))
+			if (KEY_PTRS(k) != KEY_PTRS(replace_key) ||
+			    KEY_DIRTY(k) != KEY_DIRTY(replace_key))
 				goto check_failed;

 			/* skip past gen */
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Dirty data accounting wasn't quite right - firstly, we were adding the key we're
inserting after it could have merged with another dirty key already in the
btree, and secondly we could sometimes pass the wrong offset to
bcache_dev_sectors_dirty_add() for dirty data we were overwriting - which is
important when tracking dirty data by stripe.

NOTE FOR BACKPORTERS: For 3.10 (and 3.11?) there's other accounting fixes
necessary that got squashed in with other patches; the full patch against 3.10
is 408cc2f47eeac93a, available at:
  git://evilpiepirate.org/~kent/linux-bcache.git bcache-3.10-writeback-fixes

Signed-off-by: Kent Overstreet &lt;kmo@daterainc.com&gt;
Cc: linux-stable &lt;stable@vger.kernel.org&gt; # &gt;= v3.10

diff --git a/drivers/md/bcache/btree.c b/drivers/md/bcache/btree.c
index 2a46036..4a12b2f 100644
--- a/drivers/md/bcache/btree.c
+++ b/drivers/md/bcache/btree.c
@@ -1817,7 +1817,8 @@ static bool fix_overlapping_extents(struct btree *b, struct bkey *insert,
 			if (KEY_START(k) &gt; KEY_START(insert) + sectors_found)
 				goto check_failed;

-			if (KEY_PTRS(replace_key) != KEY_PTRS(k))
+			if (KEY_PTRS(k) != KEY_PTRS(replace_key) ||
+			    KEY_DIRTY(k) != KEY_DIRTY(replace_key))
 				goto check_failed;

 			/* skip past gen */
</pre>
</div>
</content>
</entry>
<entry>
<title>bcache: Use uninterruptible sleep in writeback</title>
<updated>2013-12-16T22:04:57+00:00</updated>
<author>
<name>Kent Overstreet</name>
<email>kmo@daterainc.com</email>
</author>
<published>2013-11-29T01:28:37+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=ce2b3f595e1c56639085645e0130426e443008c0'/>
<id>ce2b3f595e1c56639085645e0130426e443008c0</id>
<content type='text'>
We're just waiting on kthread_should_stop(), nothing else, so
interruptible sleep was wrong here.

Signed-off-by: Kent Overstreet &lt;kmo@daterainc.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
We're just waiting on kthread_should_stop(), nothing else, so
interruptible sleep was wrong here.

Signed-off-by: Kent Overstreet &lt;kmo@daterainc.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
