<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-stable.git/drivers/md/bcache, branch v5.2.19</title>
<subtitle>Linux kernel stable tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/'/>
<entry>
<title>closures: fix a race on wakeup from closure_sync</title>
<updated>2019-10-05T11:13:53+00:00</updated>
<author>
<name>Kent Overstreet</name>
<email>kent.overstreet@gmail.com</email>
</author>
<published>2019-09-03T13:25:45+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=94a379db627aad261074b3f7376e68041081b232'/>
<id>94a379db627aad261074b3f7376e68041081b232</id>
<content type='text'>
[ Upstream commit a22a9602b88fabf10847f238ff81fde5f906fef7 ]

The race was when a thread using closure_sync() notices cl-&gt;s-&gt;done == 1
before the thread calling closure_put() calls wake_up_process(). Then,
it's possible for that thread to return and exit just before
wake_up_process() is called - so we're trying to wake up a process that
no longer exists.

rcu_read_lock() is sufficient to protect against this, as there's an rcu
barrier somewhere in the process teardown path.

Signed-off-by: Kent Overstreet &lt;kent.overstreet@gmail.com&gt;
Acked-by: Coly Li &lt;colyli@suse.de&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit a22a9602b88fabf10847f238ff81fde5f906fef7 ]

The race was when a thread using closure_sync() notices cl-&gt;s-&gt;done == 1
before the thread calling closure_put() calls wake_up_process(). Then,
it's possible for that thread to return and exit just before
wake_up_process() is called - so we're trying to wake up a process that
no longer exists.

rcu_read_lock() is sufficient to protect against this, as there's an rcu
barrier somewhere in the process teardown path.

Signed-off-by: Kent Overstreet &lt;kent.overstreet@gmail.com&gt;
Acked-by: Coly Li &lt;colyli@suse.de&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>bcache: fix race in btree_flush_write()</title>
<updated>2019-09-16T06:23:21+00:00</updated>
<author>
<name>Coly Li</name>
<email>colyli@suse.de</email>
</author>
<published>2019-06-28T11:59:58+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=16d2d609ff0f1aded31913e4ff887007961085f8'/>
<id>16d2d609ff0f1aded31913e4ff887007961085f8</id>
<content type='text'>
There is a race between mca_reap(), btree_node_free() and journal code
btree_flush_write(), which results very rare and strange deadlock or
panic and are very hard to reproduce.

Let me explain how the race happens. In btree_flush_write() one btree
node with oldest journal pin is selected, then it is flushed to cache
device, the select-and-flush is a two steps operation. Between these two
steps, there are something may happen inside the race window,
- The selected btree node was reaped by mca_reap() and allocated to
  other requesters for other btree node.
- The slected btree node was selected, flushed and released by mca
  shrink callback bch_mca_scan().
When btree_flush_write() tries to flush the selected btree node, firstly
b-&gt;write_lock is held by mutex_lock(). If the race happens and the
memory of selected btree node is allocated to other btree node, if that
btree node's write_lock is held already, a deadlock very probably
happens here. A worse case is the memory of the selected btree node is
released, then all references to this btree node (e.g. b-&gt;write_lock)
will trigger NULL pointer deference panic.

This race was introduced in commit cafe56359144 ("bcache: A block layer
cache"), and enlarged by commit c4dc2497d50d ("bcache: fix high CPU
occupancy during journal"), which selected 128 btree nodes and flushed
them one-by-one in a quite long time period.

Such race is not easy to reproduce before. On a Lenovo SR650 server with
48 Xeon cores, and configure 1 NVMe SSD as cache device, a MD raid0
device assembled by 3 NVMe SSDs as backing device, this race can be
observed around every 10,000 times btree_flush_write() gets called. Both
deadlock and kernel panic all happened as aftermath of the race.

The idea of the fix is to add a btree flag BTREE_NODE_journal_flush. It
is set when selecting btree nodes, and cleared after btree nodes
flushed. Then when mca_reap() selects a btree node with this bit set,
this btree node will be skipped. Since mca_reap() only reaps btree node
without BTREE_NODE_journal_flush flag, such race is avoided.

Once corner case should be noticed, that is btree_node_free(). It might
be called in some error handling code path. For example the following
code piece from btree_split(),
        2149 err_free2:
        2150         bkey_put(b-&gt;c, &amp;n2-&gt;key);
        2151         btree_node_free(n2);
        2152         rw_unlock(true, n2);
        2153 err_free1:
        2154         bkey_put(b-&gt;c, &amp;n1-&gt;key);
        2155         btree_node_free(n1);
        2156         rw_unlock(true, n1);
At line 2151 and 2155, the btree node n2 and n1 are released without
mac_reap(), so BTREE_NODE_journal_flush also needs to be checked here.
If btree_node_free() is called directly in such error handling path,
and the selected btree node has BTREE_NODE_journal_flush bit set, just
delay for 1 us and retry again. In this case this btree node won't
be skipped, just retry until the BTREE_NODE_journal_flush bit cleared,
and free the btree node memory.

Fixes: cafe56359144 ("bcache: A block layer cache")
Signed-off-by: Coly Li &lt;colyli@suse.de&gt;
Reported-and-tested-by: kbuild test robot &lt;lkp@intel.com&gt;
Cc: stable@vger.kernel.org
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
There is a race between mca_reap(), btree_node_free() and journal code
btree_flush_write(), which results very rare and strange deadlock or
panic and are very hard to reproduce.

Let me explain how the race happens. In btree_flush_write() one btree
node with oldest journal pin is selected, then it is flushed to cache
device, the select-and-flush is a two steps operation. Between these two
steps, there are something may happen inside the race window,
- The selected btree node was reaped by mca_reap() and allocated to
  other requesters for other btree node.
- The slected btree node was selected, flushed and released by mca
  shrink callback bch_mca_scan().
When btree_flush_write() tries to flush the selected btree node, firstly
b-&gt;write_lock is held by mutex_lock(). If the race happens and the
memory of selected btree node is allocated to other btree node, if that
btree node's write_lock is held already, a deadlock very probably
happens here. A worse case is the memory of the selected btree node is
released, then all references to this btree node (e.g. b-&gt;write_lock)
will trigger NULL pointer deference panic.

This race was introduced in commit cafe56359144 ("bcache: A block layer
cache"), and enlarged by commit c4dc2497d50d ("bcache: fix high CPU
occupancy during journal"), which selected 128 btree nodes and flushed
them one-by-one in a quite long time period.

Such race is not easy to reproduce before. On a Lenovo SR650 server with
48 Xeon cores, and configure 1 NVMe SSD as cache device, a MD raid0
device assembled by 3 NVMe SSDs as backing device, this race can be
observed around every 10,000 times btree_flush_write() gets called. Both
deadlock and kernel panic all happened as aftermath of the race.

The idea of the fix is to add a btree flag BTREE_NODE_journal_flush. It
is set when selecting btree nodes, and cleared after btree nodes
flushed. Then when mca_reap() selects a btree node with this bit set,
this btree node will be skipped. Since mca_reap() only reaps btree node
without BTREE_NODE_journal_flush flag, such race is avoided.

Once corner case should be noticed, that is btree_node_free(). It might
be called in some error handling code path. For example the following
code piece from btree_split(),
        2149 err_free2:
        2150         bkey_put(b-&gt;c, &amp;n2-&gt;key);
        2151         btree_node_free(n2);
        2152         rw_unlock(true, n2);
        2153 err_free1:
        2154         bkey_put(b-&gt;c, &amp;n1-&gt;key);
        2155         btree_node_free(n1);
        2156         rw_unlock(true, n1);
At line 2151 and 2155, the btree node n2 and n1 are released without
mac_reap(), so BTREE_NODE_journal_flush also needs to be checked here.
If btree_node_free() is called directly in such error handling path,
and the selected btree node has BTREE_NODE_journal_flush bit set, just
delay for 1 us and retry again. In this case this btree node won't
be skipped, just retry until the BTREE_NODE_journal_flush bit cleared,
and free the btree node memory.

Fixes: cafe56359144 ("bcache: A block layer cache")
Signed-off-by: Coly Li &lt;colyli@suse.de&gt;
Reported-and-tested-by: kbuild test robot &lt;lkp@intel.com&gt;
Cc: stable@vger.kernel.org
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>bcache: add comments for mutex_lock(&amp;b-&gt;write_lock)</title>
<updated>2019-09-16T06:23:21+00:00</updated>
<author>
<name>Coly Li</name>
<email>colyli@suse.de</email>
</author>
<published>2019-06-28T11:59:56+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=9138558777944599716f90163b8d2b192cfe59cf'/>
<id>9138558777944599716f90163b8d2b192cfe59cf</id>
<content type='text'>
When accessing or modifying BTREE_NODE_dirty bit, it is not always
necessary to acquire b-&gt;write_lock. In bch_btree_cache_free() and
mca_reap() acquiring b-&gt;write_lock is necessary, and this patch adds
comments to explain why mutex_lock(&amp;b-&gt;write_lock) is necessary for
checking or clearing BTREE_NODE_dirty bit there.

Signed-off-by: Coly Li &lt;colyli@suse.de&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When accessing or modifying BTREE_NODE_dirty bit, it is not always
necessary to acquire b-&gt;write_lock. In bch_btree_cache_free() and
mca_reap() acquiring b-&gt;write_lock is necessary, and this patch adds
comments to explain why mutex_lock(&amp;b-&gt;write_lock) is necessary for
checking or clearing BTREE_NODE_dirty bit there.

Signed-off-by: Coly Li &lt;colyli@suse.de&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>bcache: only clear BTREE_NODE_dirty bit when it is set</title>
<updated>2019-09-16T06:23:20+00:00</updated>
<author>
<name>Coly Li</name>
<email>colyli@suse.de</email>
</author>
<published>2019-06-28T11:59:55+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=b622ba2bcd4edf70a307db16e94f2292563bdc8d'/>
<id>b622ba2bcd4edf70a307db16e94f2292563bdc8d</id>
<content type='text'>
In bch_btree_cache_free() and btree_node_free(), BTREE_NODE_dirty is
always set no matter btree node is dirty or not. The code looks like
this,
	if (btree_node_dirty(b))
		btree_complete_write(b, btree_current_write(b));
	clear_bit(BTREE_NODE_dirty, &amp;b-&gt;flags);

Indeed if btree_node_dirty(b) returns false, it means BTREE_NODE_dirty
bit is cleared, then it is unnecessary to clear the bit again.

This patch only clears BTREE_NODE_dirty when btree_node_dirty(b) is
true (the bit is set), to save a few CPU cycles.

Signed-off-by: Coly Li &lt;colyli@suse.de&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
In bch_btree_cache_free() and btree_node_free(), BTREE_NODE_dirty is
always set no matter btree node is dirty or not. The code looks like
this,
	if (btree_node_dirty(b))
		btree_complete_write(b, btree_current_write(b));
	clear_bit(BTREE_NODE_dirty, &amp;b-&gt;flags);

Indeed if btree_node_dirty(b) returns false, it means BTREE_NODE_dirty
bit is cleared, then it is unnecessary to clear the bit again.

This patch only clears BTREE_NODE_dirty when btree_node_dirty(b) is
true (the bit is set), to save a few CPU cycles.

Signed-off-by: Coly Li &lt;colyli@suse.de&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>bcache: destroy dc-&gt;writeback_write_wq if failed to create dc-&gt;writeback_thread</title>
<updated>2019-07-26T07:10:52+00:00</updated>
<author>
<name>Coly Li</name>
<email>colyli@suse.de</email>
</author>
<published>2019-06-28T11:59:44+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=26b8adf883f43db0c4bba7af860a0ff6ebdb59ba'/>
<id>26b8adf883f43db0c4bba7af860a0ff6ebdb59ba</id>
<content type='text'>
commit f54d801dda14942dbefa00541d10603015b7859c upstream.

Commit 9baf30972b55 ("bcache: fix for gc and write-back race") added a
new work queue dc-&gt;writeback_write_wq, but forgot to destroy it in the
error condition when creating dc-&gt;writeback_thread failed.

This patch destroys dc-&gt;writeback_write_wq if kthread_create() returns
error pointer to dc-&gt;writeback_thread, then a memory leak is avoided.

Fixes: 9baf30972b55 ("bcache: fix for gc and write-back race")
Signed-off-by: Coly Li &lt;colyli@suse.de&gt;
Cc: stable@vger.kernel.org
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit f54d801dda14942dbefa00541d10603015b7859c upstream.

Commit 9baf30972b55 ("bcache: fix for gc and write-back race") added a
new work queue dc-&gt;writeback_write_wq, but forgot to destroy it in the
error condition when creating dc-&gt;writeback_thread failed.

This patch destroys dc-&gt;writeback_write_wq if kthread_create() returns
error pointer to dc-&gt;writeback_thread, then a memory leak is avoided.

Fixes: 9baf30972b55 ("bcache: fix for gc and write-back race")
Signed-off-by: Coly Li &lt;colyli@suse.de&gt;
Cc: stable@vger.kernel.org
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>bcache: fix mistaken sysfs entry for io_error counter</title>
<updated>2019-07-26T07:10:52+00:00</updated>
<author>
<name>Coly Li</name>
<email>colyli@suse.de</email>
</author>
<published>2019-06-28T11:59:43+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=8b52937e010fceba8be11242e0eaea0af6143080'/>
<id>8b52937e010fceba8be11242e0eaea0af6143080</id>
<content type='text'>
commit 5461999848e0462c14f306a62923d22de820a59c upstream.

In bch_cached_dev_files[] from driver/md/bcache/sysfs.c, sysfs_errors is
incorrectly inserted in. The correct entry should be sysfs_io_errors.

This patch fixes the problem and now I/O errors of cached device can be
read from /sys/block/bcache&lt;N&gt;/bcache/io_errors.

Fixes: c7b7bd07404c5 ("bcache: add io_disable to struct cached_dev")
Signed-off-by: Coly Li &lt;colyli@suse.de&gt;
Cc: stable@vger.kernel.org
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 5461999848e0462c14f306a62923d22de820a59c upstream.

In bch_cached_dev_files[] from driver/md/bcache/sysfs.c, sysfs_errors is
incorrectly inserted in. The correct entry should be sysfs_io_errors.

This patch fixes the problem and now I/O errors of cached device can be
read from /sys/block/bcache&lt;N&gt;/bcache/io_errors.

Fixes: c7b7bd07404c5 ("bcache: add io_disable to struct cached_dev")
Signed-off-by: Coly Li &lt;colyli@suse.de&gt;
Cc: stable@vger.kernel.org
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>bcache: ignore read-ahead request failure on backing device</title>
<updated>2019-07-26T07:10:52+00:00</updated>
<author>
<name>Coly Li</name>
<email>colyli@suse.de</email>
</author>
<published>2019-06-28T11:59:29+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=735c54c3cecf40a7d81adcf6d9a8613925b6491d'/>
<id>735c54c3cecf40a7d81adcf6d9a8613925b6491d</id>
<content type='text'>
commit 578df99b1b0531d19af956530fe4da63d01a1604 upstream.

When md raid device (e.g. raid456) is used as backing device, read-ahead
requests on a degrading and recovering md raid device might be failured
immediately by md raid code, but indeed this md raid array can still be
read or write for normal I/O requests. Therefore such failed read-ahead
request are not real hardware failure. Further more, after degrading and
recovering accomplished, read-ahead requests will be handled by md raid
array again.

For such condition, I/O failures of read-ahead requests don't indicate
real health status (because normal I/O still be served), they should not
be counted into I/O error counter dc-&gt;io_errors.

Since there is no simple way to detect whether the backing divice is a
md raid device, this patch simply ignores I/O failures for read-ahead
bios on backing device, to avoid bogus backing device failure on a
degrading md raid array.

Suggested-and-tested-by: Thorsten Knabe &lt;linux@thorsten-knabe.de&gt;
Signed-off-by: Coly Li &lt;colyli@suse.de&gt;
Cc: stable@vger.kernel.org
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 578df99b1b0531d19af956530fe4da63d01a1604 upstream.

When md raid device (e.g. raid456) is used as backing device, read-ahead
requests on a degrading and recovering md raid device might be failured
immediately by md raid code, but indeed this md raid array can still be
read or write for normal I/O requests. Therefore such failed read-ahead
request are not real hardware failure. Further more, after degrading and
recovering accomplished, read-ahead requests will be handled by md raid
array again.

For such condition, I/O failures of read-ahead requests don't indicate
real health status (because normal I/O still be served), they should not
be counted into I/O error counter dc-&gt;io_errors.

Since there is no simple way to detect whether the backing divice is a
md raid device, this patch simply ignores I/O failures for read-ahead
bios on backing device, to avoid bogus backing device failure on a
degrading md raid array.

Suggested-and-tested-by: Thorsten Knabe &lt;linux@thorsten-knabe.de&gt;
Signed-off-by: Coly Li &lt;colyli@suse.de&gt;
Cc: stable@vger.kernel.org
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>bcache: Revert "bcache: free heap cache_set-&gt;flush_btree in bch_journal_free"</title>
<updated>2019-07-26T07:10:52+00:00</updated>
<author>
<name>Coly Li</name>
<email>colyli@suse.de</email>
</author>
<published>2019-06-28T11:59:53+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=e294fa562fadad697cd49e2d7883b1b95a8aa230'/>
<id>e294fa562fadad697cd49e2d7883b1b95a8aa230</id>
<content type='text'>
commit ba82c1ac1667d6efb91a268edb13fc9cdaecec9b upstream.

This reverts commit 6268dc2c4703aabfb0b35681be709acf4c2826c6.

This patch depends on commit c4dc2497d50d ("bcache: fix high CPU
occupancy during journal") which is reverted in previous patch. So
revert this one too.

Fixes: 6268dc2c4703 ("bcache: free heap cache_set-&gt;flush_btree in bch_journal_free")
Signed-off-by: Coly Li &lt;colyli@suse.de&gt;
Cc: stable@vger.kernel.org
Cc: Shenghui Wang &lt;shhuiw@foxmail.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit ba82c1ac1667d6efb91a268edb13fc9cdaecec9b upstream.

This reverts commit 6268dc2c4703aabfb0b35681be709acf4c2826c6.

This patch depends on commit c4dc2497d50d ("bcache: fix high CPU
occupancy during journal") which is reverted in previous patch. So
revert this one too.

Fixes: 6268dc2c4703 ("bcache: free heap cache_set-&gt;flush_btree in bch_journal_free")
Signed-off-by: Coly Li &lt;colyli@suse.de&gt;
Cc: stable@vger.kernel.org
Cc: Shenghui Wang &lt;shhuiw@foxmail.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>bcache: Revert "bcache: fix high CPU occupancy during journal"</title>
<updated>2019-07-26T07:10:52+00:00</updated>
<author>
<name>Coly Li</name>
<email>colyli@suse.de</email>
</author>
<published>2019-06-28T11:59:54+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=fe4a2764fd85185f9014beecc68510d1f9bdc958'/>
<id>fe4a2764fd85185f9014beecc68510d1f9bdc958</id>
<content type='text'>
commit 249a5f6da57c28a903c75d81505d58ec8c10030d upstream.

This reverts commit c4dc2497d50d9c6fb16aa0d07b6a14f3b2adb1e0.

This patch enlarges a race between normal btree flush code path and
flush_btree_write(), which causes deadlock when journal space is
exhausted. Reverts this patch makes the race window from 128 btree
nodes to only 1 btree nodes.

Fixes: c4dc2497d50d ("bcache: fix high CPU occupancy during journal")
Signed-off-by: Coly Li &lt;colyli@suse.de&gt;
Cc: stable@vger.kernel.org
Cc: Tang Junhui &lt;tang.junhui.linux@gmail.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 249a5f6da57c28a903c75d81505d58ec8c10030d upstream.

This reverts commit c4dc2497d50d9c6fb16aa0d07b6a14f3b2adb1e0.

This patch enlarges a race between normal btree flush code path and
flush_btree_write(), which causes deadlock when journal space is
exhausted. Reverts this patch makes the race window from 128 btree
nodes to only 1 btree nodes.

Fixes: c4dc2497d50d ("bcache: fix high CPU occupancy during journal")
Signed-off-by: Coly Li &lt;colyli@suse.de&gt;
Cc: stable@vger.kernel.org
Cc: Tang Junhui &lt;tang.junhui.linux@gmail.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>Revert "bcache: set CACHE_SET_IO_DISABLE in bch_cached_dev_error()"</title>
<updated>2019-07-26T07:10:51+00:00</updated>
<author>
<name>Coly Li</name>
<email>colyli@suse.de</email>
</author>
<published>2019-06-28T11:59:27+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=3244e913f1061b3e0b412ea8b664706eb6a7591a'/>
<id>3244e913f1061b3e0b412ea8b664706eb6a7591a</id>
<content type='text'>
commit 695277f16b3a102fcc22c97fdf2de77c7b19f0b3 upstream.

This reverts commit 6147305c73e4511ca1a975b766b97a779d442567.

Although this patch helps the failed bcache device to stop faster when
too many I/O errors detected on corresponding cached device, setting
CACHE_SET_IO_DISABLE bit to cache set c-&gt;flags was not a good idea. This
operation will disable all I/Os on cache set, which means other attached
bcache devices won't work neither.

Without this patch, the failed bcache device can also be stopped
eventually if internal I/O accomplished (e.g. writeback). Therefore here
I revert it.

Fixes: 6147305c73e4 ("bcache: set CACHE_SET_IO_DISABLE in bch_cached_dev_error()")
Reported-by: Yong Li &lt;mr.liyong@qq.com&gt;
Signed-off-by: Coly Li &lt;colyli@suse.de&gt;
Cc: stable@vger.kernel.org
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 695277f16b3a102fcc22c97fdf2de77c7b19f0b3 upstream.

This reverts commit 6147305c73e4511ca1a975b766b97a779d442567.

Although this patch helps the failed bcache device to stop faster when
too many I/O errors detected on corresponding cached device, setting
CACHE_SET_IO_DISABLE bit to cache set c-&gt;flags was not a good idea. This
operation will disable all I/Os on cache set, which means other attached
bcache devices won't work neither.

Without this patch, the failed bcache device can also be stopped
eventually if internal I/O accomplished (e.g. writeback). Therefore here
I revert it.

Fixes: 6147305c73e4 ("bcache: set CACHE_SET_IO_DISABLE in bch_cached_dev_error()")
Reported-by: Yong Li &lt;mr.liyong@qq.com&gt;
Signed-off-by: Coly Li &lt;colyli@suse.de&gt;
Cc: stable@vger.kernel.org
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
</feed>
