<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-stable.git/drivers/md/bcache, branch linux-3.13.y</title>
<subtitle>Linux kernel stable tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/'/>
<entry>
<title>bcache: fix BUG_ON due to integer overflow with GC_SECTORS_USED</title>
<updated>2014-02-20T19:10:11+00:00</updated>
<author>
<name>Darrick J. Wong</name>
<email>darrick.wong@oracle.com</email>
</author>
<published>2014-01-29T00:57:39+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=b8bb5472da4bf66da971acb7a923f8a7e58420a7'/>
<id>b8bb5472da4bf66da971acb7a923f8a7e58420a7</id>
<content type='text'>
commit 947174476701fbc84ea8c7ec9664270f9d80b076 upstream.

The BUG_ON at the end of __bch_btree_mark_key can be triggered due to
an integer overflow error:

BITMASK(GC_SECTORS_USED, struct bucket, gc_mark, 2, 13);
...
SET_GC_SECTORS_USED(g, min_t(unsigned,
	     GC_SECTORS_USED(g) + KEY_SIZE(k),
	     (1 &lt;&lt; 14) - 1));
BUG_ON(!GC_SECTORS_USED(g));

In bcache.h, the SECTORS_USED bitfield is defined to be 13 bits wide.
While the SET_ code tries to ensure that the field doesn't overflow by
clamping it to (1&lt;&lt;14)-1 == 16383, this is incorrect because 16383
requires 14 bits.  Therefore, if GC_SECTORS_USED() + KEY_SIZE() =
8192, the SET_ statement tries to store 8192 into a 13-bit field.  In
a 13-bit field, 8192 becomes zero, thus triggering the BUG_ON.

Therefore, create a field width constant and a max value constant, and
use those to create the bitfield and check the inputs to
SET_GC_SECTORS_USED.  Arguably the BITMASK() template ought to have
BUG_ON checks for too-large values, but that's a separate patch.

Signed-off-by: Darrick J. Wong &lt;darrick.wong@oracle.com&gt;
Cc: Kent Overstreet &lt;kmo@daterainc.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 947174476701fbc84ea8c7ec9664270f9d80b076 upstream.

The BUG_ON at the end of __bch_btree_mark_key can be triggered due to
an integer overflow error:

BITMASK(GC_SECTORS_USED, struct bucket, gc_mark, 2, 13);
...
SET_GC_SECTORS_USED(g, min_t(unsigned,
	     GC_SECTORS_USED(g) + KEY_SIZE(k),
	     (1 &lt;&lt; 14) - 1));
BUG_ON(!GC_SECTORS_USED(g));

In bcache.h, the SECTORS_USED bitfield is defined to be 13 bits wide.
While the SET_ code tries to ensure that the field doesn't overflow by
clamping it to (1&lt;&lt;14)-1 == 16383, this is incorrect because 16383
requires 14 bits.  Therefore, if GC_SECTORS_USED() + KEY_SIZE() =
8192, the SET_ statement tries to store 8192 into a 13-bit field.  In
a 13-bit field, 8192 becomes zero, thus triggering the BUG_ON.

Therefore, create a field width constant and a max value constant, and
use those to create the bitfield and check the inputs to
SET_GC_SECTORS_USED.  Arguably the BITMASK() template ought to have
BUG_ON checks for too-large values, but that's a separate patch.

Signed-off-by: Darrick J. Wong &lt;darrick.wong@oracle.com&gt;
Cc: Kent Overstreet &lt;kmo@daterainc.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>bcache: Data corruption fix</title>
<updated>2014-02-06T19:34:06+00:00</updated>
<author>
<name>Kent Overstreet</name>
<email>kmo@daterainc.com</email>
</author>
<published>2013-12-18T01:51:02+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=9047ec62665cb8931840214b038cc1d22c4ed2e4'/>
<id>9047ec62665cb8931840214b038cc1d22c4ed2e4</id>
<content type='text'>
commit ef71ec00002d92a08eb27e9d036e3d48835b6597 upstream.

The code that handles overlapping extents that we've just read back in from disk
was depending on the behaviour of the code that handles overlapping extents as
we're inserting into a btree node in the case of an insert that forced an
existing extent to be split: on insert, if we had to split we'd also insert a
new extent to represent the top part of the old extent - and then that new
extent would get written out.

The code that read the extents back in thus not bother with splitting extents -
if it saw an extent that ovelapped in the middle of an older extent, it would
trim the old extent to only represent the bottom part, assuming that the
original insert would've inserted a new extent to represent the top part.

I still haven't figured out _how_ it can happen, but I'm now pretty convinced
(and testing has confirmed) that there's some kind of an obscure corner case
(probably involving extent merging, and multiple overwrites in different sets)
that breaks this. The fix is to change the mergesort fixup code to split extents
itself when required.

Signed-off-by: Kent Overstreet &lt;kmo@daterainc.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit ef71ec00002d92a08eb27e9d036e3d48835b6597 upstream.

The code that handles overlapping extents that we've just read back in from disk
was depending on the behaviour of the code that handles overlapping extents as
we're inserting into a btree node in the case of an insert that forced an
existing extent to be split: on insert, if we had to split we'd also insert a
new extent to represent the top part of the old extent - and then that new
extent would get written out.

The code that read the extents back in thus not bother with splitting extents -
if it saw an extent that ovelapped in the middle of an older extent, it would
trim the old extent to only represent the bottom part, assuming that the
original insert would've inserted a new extent to represent the top part.

I still haven't figured out _how_ it can happen, but I'm now pretty convinced
(and testing has confirmed) that there's some kind of an obscure corner case
(probably involving extent merging, and multiple overwrites in different sets)
that breaks this. The fix is to change the mergesort fixup code to split extents
itself when required.

Signed-off-by: Kent Overstreet &lt;kmo@daterainc.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>bcache: New writeback PD controller</title>
<updated>2013-12-16T22:22:59+00:00</updated>
<author>
<name>Kent Overstreet</name>
<email>kmo@daterainc.com</email>
</author>
<published>2013-11-11T21:58:34+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=16749c23c00c686ed168471963e3ddb0f3fcd855'/>
<id>16749c23c00c686ed168471963e3ddb0f3fcd855</id>
<content type='text'>
The old writeback PD controller could get into states where it had throttled all
the way down and take way too long to recover - it was too complicated to really
understand what it was doing.

This rewrites a good chunk of it to hopefully be simpler and make more sense,
and it also pays more attention to units which should make the behaviour a bit
easier to understand.

Signed-off-by: Kent Overstreet &lt;kmo@daterainc.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The old writeback PD controller could get into states where it had throttled all
the way down and take way too long to recover - it was too complicated to really
understand what it was doing.

This rewrites a good chunk of it to hopefully be simpler and make more sense,
and it also pays more attention to units which should make the behaviour a bit
easier to understand.

Signed-off-by: Kent Overstreet &lt;kmo@daterainc.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>bcache: bugfix for race between moving_gc and bucket_invalidate</title>
<updated>2013-12-16T22:22:58+00:00</updated>
<author>
<name>Kent Overstreet</name>
<email>kmo@daterainc.com</email>
</author>
<published>2013-12-16T22:12:09+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=6d3d1a9c542b19dff1c7d7c8354d0869e4655287'/>
<id>6d3d1a9c542b19dff1c7d7c8354d0869e4655287</id>
<content type='text'>
There is a possibility for a bucket to be invalidated by the allocator
while moving_gc was copying it's contents to another bucket, if the
bucket only held cached data. To prevent this moving checks for
a stale ptr (to an invalidated bucket), before and after reads.
It it finds one, it simply ignores moving that data. This only
affects bcache if the moving_gc was turned on, note that it's
off by default.

Signed-off-by: Nicholas Swenson &lt;nks@daterainc.com&gt;
Signed-off-by: Kent Overstreet &lt;kmo@daterainc.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
There is a possibility for a bucket to be invalidated by the allocator
while moving_gc was copying it's contents to another bucket, if the
bucket only held cached data. To prevent this moving checks for
a stale ptr (to an invalidated bucket), before and after reads.
It it finds one, it simply ignores moving that data. This only
affects bcache if the moving_gc was turned on, note that it's
off by default.

Signed-off-by: Nicholas Swenson &lt;nks@daterainc.com&gt;
Signed-off-by: Kent Overstreet &lt;kmo@daterainc.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>bcache: fix for gc and writeback race</title>
<updated>2013-12-16T22:22:58+00:00</updated>
<author>
<name>Nicholas Swenson</name>
<email>nks@daterainc.com</email>
</author>
<published>2013-11-27T03:14:23+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=bf0a628a95dba7f983b6047cea695fb066fb2512'/>
<id>bf0a628a95dba7f983b6047cea695fb066fb2512</id>
<content type='text'>
Garbage collector needs to check keys in the writeback keybuf to
make sure it's not invalidating buckets to which the writeback
keys point to.

Signed-off-by: Nicholas Swenson &lt;nks@daterainc.com&gt;
Signed-off-by: Kent Overstreet &lt;kmo@daterainc.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Garbage collector needs to check keys in the writeback keybuf to
make sure it's not invalidating buckets to which the writeback
keys point to.

Signed-off-by: Nicholas Swenson &lt;nks@daterainc.com&gt;
Signed-off-by: Kent Overstreet &lt;kmo@daterainc.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>bcache: bugfix - moving_gc now moves only correct buckets</title>
<updated>2013-12-16T22:22:58+00:00</updated>
<author>
<name>Nicholas Swenson</name>
<email>nks@daterainc.com</email>
</author>
<published>2013-11-08T01:53:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=981aa8c091e164ea51dd1e81b71a1f3852bbcceb'/>
<id>981aa8c091e164ea51dd1e81b71a1f3852bbcceb</id>
<content type='text'>
Removed gc_move_threshold because picking buckets only by
threshold could lead moving extra buckets (ei. if there are
buckets at the threshold that aren't supposed to be moved
do to space considerations).

This is replaced by a GC_MOVE bit in the gc_mark bitmask.
Now only marked buckets get moved.

Signed-off-by: Nicholas Swenson &lt;nks@daterainc.com&gt;
Signed-off-by: Kent Overstreet &lt;kmo@daterainc.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Removed gc_move_threshold because picking buckets only by
threshold could lead moving extra buckets (ei. if there are
buckets at the threshold that aren't supposed to be moved
do to space considerations).

This is replaced by a GC_MOVE bit in the gc_mark bitmask.
Now only marked buckets get moved.

Signed-off-by: Nicholas Swenson &lt;nks@daterainc.com&gt;
Signed-off-by: Kent Overstreet &lt;kmo@daterainc.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>bcache: fix for gc crashing when no sectors are used</title>
<updated>2013-12-16T22:22:57+00:00</updated>
<author>
<name>Nicholas Swenson</name>
<email>nks@daterainc.com</email>
</author>
<published>2013-11-01T02:25:18+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=bee63f40cb5f5e8ab2abfbc85acde99cc0acd4b5'/>
<id>bee63f40cb5f5e8ab2abfbc85acde99cc0acd4b5</id>
<content type='text'>
Signed-off-by: Nicholas Swenson &lt;nks@daterainc.com&gt;
Signed-off-by: Kent Overstreet &lt;kmo@daterainc.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Signed-off-by: Nicholas Swenson &lt;nks@daterainc.com&gt;
Signed-off-by: Kent Overstreet &lt;kmo@daterainc.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>bcache: Fix heap_peek() macro</title>
<updated>2013-12-16T22:22:57+00:00</updated>
<author>
<name>Nicholas Swenson</name>
<email>nks@daterainc.com</email>
</author>
<published>2013-10-24T00:35:26+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=97d11a660fd906dbea3dccd2638495d8497c3c81'/>
<id>97d11a660fd906dbea3dccd2638495d8497c3c81</id>
<content type='text'>
Signed-off-by: Nicholas Swenson &lt;nks@daterainc.com&gt;
Signed-off-by: Kent Overstreet &lt;kmo@daterainc.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Signed-off-by: Nicholas Swenson &lt;nks@daterainc.com&gt;
Signed-off-by: Kent Overstreet &lt;kmo@daterainc.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>bcache: Fix for can_attach_cache()</title>
<updated>2013-12-16T22:22:57+00:00</updated>
<author>
<name>Nicholas Swenson</name>
<email>nks@daterainc.com</email>
</author>
<published>2013-10-22T20:19:23+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=9eb8ebeb2471b8cbaa078066aeb85fe5d46f5304'/>
<id>9eb8ebeb2471b8cbaa078066aeb85fe5d46f5304</id>
<content type='text'>
Signed-off-by: Nicholas Swenson &lt;nks@daterainc.com&gt;
Signed-off-by: Kent Overstreet &lt;kmo@daterainc.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Signed-off-by: Nicholas Swenson &lt;nks@daterainc.com&gt;
Signed-off-by: Kent Overstreet &lt;kmo@daterainc.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>bcache: Fix dirty_data accounting</title>
<updated>2013-12-16T22:22:16+00:00</updated>
<author>
<name>Kent Overstreet</name>
<email>kmo@daterainc.com</email>
</author>
<published>2013-11-11T05:55:27+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=d24a6e1087030b6da286df9433add5fa2f21b83b'/>
<id>d24a6e1087030b6da286df9433add5fa2f21b83b</id>
<content type='text'>
Dirty data accounting wasn't quite right - firstly, we were adding the key we're
inserting after it could have merged with another dirty key already in the
btree, and secondly we could sometimes pass the wrong offset to
bcache_dev_sectors_dirty_add() for dirty data we were overwriting - which is
important when tracking dirty data by stripe.

NOTE FOR BACKPORTERS: For 3.10 (and 3.11?) there's other accounting fixes
necessary that got squashed in with other patches; the full patch against 3.10
is 408cc2f47eeac93a, available at:
  git://evilpiepirate.org/~kent/linux-bcache.git bcache-3.10-writeback-fixes

Signed-off-by: Kent Overstreet &lt;kmo@daterainc.com&gt;
Cc: linux-stable &lt;stable@vger.kernel.org&gt; # &gt;= v3.10

diff --git a/drivers/md/bcache/btree.c b/drivers/md/bcache/btree.c
index 2a46036..4a12b2f 100644
--- a/drivers/md/bcache/btree.c
+++ b/drivers/md/bcache/btree.c
@@ -1817,7 +1817,8 @@ static bool fix_overlapping_extents(struct btree *b, struct bkey *insert,
 			if (KEY_START(k) &gt; KEY_START(insert) + sectors_found)
 				goto check_failed;

-			if (KEY_PTRS(replace_key) != KEY_PTRS(k))
+			if (KEY_PTRS(k) != KEY_PTRS(replace_key) ||
+			    KEY_DIRTY(k) != KEY_DIRTY(replace_key))
 				goto check_failed;

 			/* skip past gen */
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Dirty data accounting wasn't quite right - firstly, we were adding the key we're
inserting after it could have merged with another dirty key already in the
btree, and secondly we could sometimes pass the wrong offset to
bcache_dev_sectors_dirty_add() for dirty data we were overwriting - which is
important when tracking dirty data by stripe.

NOTE FOR BACKPORTERS: For 3.10 (and 3.11?) there's other accounting fixes
necessary that got squashed in with other patches; the full patch against 3.10
is 408cc2f47eeac93a, available at:
  git://evilpiepirate.org/~kent/linux-bcache.git bcache-3.10-writeback-fixes

Signed-off-by: Kent Overstreet &lt;kmo@daterainc.com&gt;
Cc: linux-stable &lt;stable@vger.kernel.org&gt; # &gt;= v3.10

diff --git a/drivers/md/bcache/btree.c b/drivers/md/bcache/btree.c
index 2a46036..4a12b2f 100644
--- a/drivers/md/bcache/btree.c
+++ b/drivers/md/bcache/btree.c
@@ -1817,7 +1817,8 @@ static bool fix_overlapping_extents(struct btree *b, struct bkey *insert,
 			if (KEY_START(k) &gt; KEY_START(insert) + sectors_found)
 				goto check_failed;

-			if (KEY_PTRS(replace_key) != KEY_PTRS(k))
+			if (KEY_PTRS(k) != KEY_PTRS(replace_key) ||
+			    KEY_DIRTY(k) != KEY_DIRTY(replace_key))
 				goto check_failed;

 			/* skip past gen */
</pre>
</div>
</content>
</entry>
</feed>
