<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-stable.git/drivers/md/bcache/request.c, branch v3.16.78</title>
<subtitle>Linux kernel stable tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/'/>
<entry>
<title>bcache: fix wrong cache_misses statistics</title>
<updated>2019-02-11T17:53:29+00:00</updated>
<author>
<name>tang.junhui</name>
<email>tang.junhui@zte.com.cn</email>
</author>
<published>2017-10-30T21:46:34+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=0736abb37077547bb45e282c9a8040104be478a4'/>
<id>0736abb37077547bb45e282c9a8040104be478a4</id>
<content type='text'>
commit c157313791a999646901b3e3c6888514ebc36d62 upstream.

Currently, cache-missed IOs are identified by s-&gt;cache_miss. However, in
many situations cached_dev_cache_miss() never assigns a value to
s-&gt;cache_miss for a missed IO, for example when the IO is bypassed
(s-&gt;iop.bypass = 1) or when allocating cache_bio fails. In these situations
the code jumps to out_put or out_submit with s-&gt;cache_miss still NULL, which
leads bch_mark_cache_accounting() to treat the IO as a hit.

[ML: applied by 3-way merge]

Signed-off-by: tang.junhui &lt;tang.junhui@zte.com.cn&gt;
Reviewed-by: Michael Lyle &lt;mlyle@lyle.org&gt;
Reviewed-by: Coly Li &lt;colyli@suse.de&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit c157313791a999646901b3e3c6888514ebc36d62 upstream.

Currently, cache-missed IOs are identified by s-&gt;cache_miss. However, in
many situations cached_dev_cache_miss() never assigns a value to
s-&gt;cache_miss for a missed IO, for example when the IO is bypassed
(s-&gt;iop.bypass = 1) or when allocating cache_bio fails. In these situations
the code jumps to out_put or out_submit with s-&gt;cache_miss still NULL, which
leads bch_mark_cache_accounting() to treat the IO as a hit.

[ML: applied by 3-way merge]

Signed-off-by: tang.junhui &lt;tang.junhui@zte.com.cn&gt;
Reviewed-by: Michael Lyle &lt;mlyle@lyle.org&gt;
Reviewed-by: Coly Li &lt;colyli@suse.de&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>bcache: don't embed 'return' statements in closure macros</title>
<updated>2018-12-16T22:09:26+00:00</updated>
<author>
<name>Jens Axboe</name>
<email>axboe@fb.com</email>
</author>
<published>2015-03-06T15:37:46+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=0063e500cf93dbb04976b369b27d2a0b37e1d7cf'/>
<id>0063e500cf93dbb04976b369b27d2a0b37e1d7cf</id>
<content type='text'>
commit 77b5a08427e87514c33730afc18cd02c9475e2c3 upstream.

This is horribly confusing: it breaks the flow of the code without
it being apparent in the caller.

Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
Acked-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 77b5a08427e87514c33730afc18cd02c9475e2c3 upstream.

This is horribly confusing: it breaks the flow of the code without
it being apparent in the caller.

Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
Acked-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>bcache: recover data from backing when data is clean</title>
<updated>2018-03-03T15:51:33+00:00</updated>
<author>
<name>Rui Hua</name>
<email>huarui.dev@gmail.com</email>
</author>
<published>2017-11-24T23:14:26+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=fd4811d94f3137680e316c7e0d2492a4cd08b7a3'/>
<id>fd4811d94f3137680e316c7e0d2492a4cd08b7a3</id>
<content type='text'>
commit e393aa2446150536929140739f09c6ecbcbea7f0 upstream.

When a read request hits clean data in the cache device, a situation
called a cache read race can occur in bcache (see the comment at the tail
of cache_look_up(); the following explanation is copied from there):
The bucket we're reading from might be reused while our bio is in flight,
and we could then end up reading the wrong data. We guard against this
by checking (in bch_cache_read_endio()) if the pointer is stale again;
if so, we treat it as an error (s-&gt;iop.error = -EINTR) and reread from
the backing device (but we don't pass that error up anywhere).

Note that a cache read race happens under normal circumstances, not as a
consequence of SSD failure. It is counted and shown in
/sys/fs/bcache/XXX/internal/cache_read_races.

Without this patch, in writeback mode we never reread from the backing
device when a cache read race happens until the whole cache device is
clean, because the condition
(s-&gt;recoverable &amp;&amp; (dc &amp;&amp; !atomic_read(&amp;dc-&gt;has_dirty))) is false in
cached_dev_read_error(). In this situation s-&gt;iop.error (= -EINTR) is
passed up, and the user finally receives -EINTR when its bio ends. This
is not appropriate and looks weird to upper-layer applications.

This patch uses s-&gt;read_dirty_data to judge whether the read request hit
dirty data in the cache device; it is safe to reread data from the
backing device when the read request hit clean data. This not only
handles the cache read race but also recovers data when a read request
from the cache device fails.

[edited by mlyle to fix up whitespace, commit log title, comment
spelling]

Fixes: d59b23795933 ("bcache: only permit to recovery read error when cache device is clean")
Signed-off-by: Hua Rui &lt;huarui.dev@gmail.com&gt;
Reviewed-by: Michael Lyle &lt;mlyle@lyle.org&gt;
Reviewed-by: Coly Li &lt;colyli@suse.de&gt;
Signed-off-by: Michael Lyle &lt;mlyle@lyle.org&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit e393aa2446150536929140739f09c6ecbcbea7f0 upstream.

When a read request hits clean data in the cache device, a situation
called a cache read race can occur in bcache (see the comment at the tail
of cache_look_up(); the following explanation is copied from there):
The bucket we're reading from might be reused while our bio is in flight,
and we could then end up reading the wrong data. We guard against this
by checking (in bch_cache_read_endio()) if the pointer is stale again;
if so, we treat it as an error (s-&gt;iop.error = -EINTR) and reread from
the backing device (but we don't pass that error up anywhere).

Note that a cache read race happens under normal circumstances, not as a
consequence of SSD failure. It is counted and shown in
/sys/fs/bcache/XXX/internal/cache_read_races.

Without this patch, in writeback mode we never reread from the backing
device when a cache read race happens until the whole cache device is
clean, because the condition
(s-&gt;recoverable &amp;&amp; (dc &amp;&amp; !atomic_read(&amp;dc-&gt;has_dirty))) is false in
cached_dev_read_error(). In this situation s-&gt;iop.error (= -EINTR) is
passed up, and the user finally receives -EINTR when its bio ends. This
is not appropriate and looks weird to upper-layer applications.

This patch uses s-&gt;read_dirty_data to judge whether the read request hit
dirty data in the cache device; it is safe to reread data from the
backing device when the read request hit clean data. This not only
handles the cache read race but also recovers data when a read request
from the cache device fails.

[edited by mlyle to fix up whitespace, commit log title, comment
spelling]

Fixes: d59b23795933 ("bcache: only permit to recovery read error when cache device is clean")
Signed-off-by: Hua Rui &lt;huarui.dev@gmail.com&gt;
Reviewed-by: Michael Lyle &lt;mlyle@lyle.org&gt;
Reviewed-by: Coly Li &lt;colyli@suse.de&gt;
Signed-off-by: Michael Lyle &lt;mlyle@lyle.org&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>bcache: only permit to recovery read error when cache device is clean</title>
<updated>2018-02-13T18:42:11+00:00</updated>
<author>
<name>Coly Li</name>
<email>colyli@suse.de</email>
</author>
<published>2017-10-30T21:46:31+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=1700644e9036ce3b22bf52a47850372c369ff448'/>
<id>1700644e9036ce3b22bf52a47850372c369ff448</id>
<content type='text'>
commit d59b23795933678c9638fd20c942d2b4f3cd6185 upstream.

When bcache does read I/Os, for example in writeback or writethrough mode,
if a read request on the cache device fails, bcache will try to recover
the request by reading from the cached device. If the data on the cached
device is not synced with the cache device, the requester will get stale
data.

For a critical storage system like a database, providing stale data from
recovery may result in application-level data corruption, which is
unacceptable.

With this patch, for a failed read request in writeback or writethrough
mode, a recoverable read request is only recovered when the cache device
is clean. That is to say, all data on the cached device is up to date.

For other cache modes in bcache, read requests will never hit
cached_dev_read_error(), so they don't need this patch.

Please note, because the cache mode can be switched arbitrarily at run
time, a writethrough mode might have been switched from a writeback mode.
Therefore checking dc-&gt;has_dirty in writethrough mode still makes sense.

Changelog:
v4: Fix parens error pointed out by Michael Lyle.
v3: In response to Kent Overstreet, who thinks recovering stale data is a
    bug to fix and an option to permit it is unnecessary, the sysfs file
    is removed in this version.
v2: rename sysfs entry from allow_stale_data_on_failure to
    allow_stale_data_on_failure, and fix the confusing commit log.
v1: initial patch posted.

[small change to patch comment spelling by mlyle]

Signed-off-by: Coly Li &lt;colyli@suse.de&gt;
Signed-off-by: Michael Lyle &lt;mlyle@lyle.org&gt;
Reported-by: Arne Wolf &lt;awolf@lenovo.com&gt;
Reviewed-by: Michael Lyle &lt;mlyle@lyle.org&gt;
Cc: Kent Overstreet &lt;kent.overstreet@gmail.com&gt;
Cc: Nix &lt;nix@esperi.org.uk&gt;
Cc: Kai Krakow &lt;hurikhan77@gmail.com&gt;
Cc: Eric Wheeler &lt;bcache@lists.ewheeler.net&gt;
Cc: Junhui Tang &lt;tang.junhui@zte.com.cn&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit d59b23795933678c9638fd20c942d2b4f3cd6185 upstream.

When bcache does read I/Os, for example in writeback or writethrough mode,
if a read request on the cache device fails, bcache will try to recover
the request by reading from the cached device. If the data on the cached
device is not synced with the cache device, the requester will get stale
data.

For a critical storage system like a database, providing stale data from
recovery may result in application-level data corruption, which is
unacceptable.

With this patch, for a failed read request in writeback or writethrough
mode, a recoverable read request is only recovered when the cache device
is clean. That is to say, all data on the cached device is up to date.

For other cache modes in bcache, read requests will never hit
cached_dev_read_error(), so they don't need this patch.

Please note, because the cache mode can be switched arbitrarily at run
time, a writethrough mode might have been switched from a writeback mode.
Therefore checking dc-&gt;has_dirty in writethrough mode still makes sense.

Changelog:
v4: Fix parens error pointed out by Michael Lyle.
v3: In response to Kent Overstreet, who thinks recovering stale data is a
    bug to fix and an option to permit it is unnecessary, the sysfs file
    is removed in this version.
v2: rename sysfs entry from allow_stale_data_on_failure to
    allow_stale_data_on_failure, and fix the confusing commit log.
v1: initial patch posted.

[small change to patch comment spelling by mlyle]

Signed-off-by: Coly Li &lt;colyli@suse.de&gt;
Signed-off-by: Michael Lyle &lt;mlyle@lyle.org&gt;
Reported-by: Arne Wolf &lt;awolf@lenovo.com&gt;
Reviewed-by: Michael Lyle &lt;mlyle@lyle.org&gt;
Cc: Kent Overstreet &lt;kent.overstreet@gmail.com&gt;
Cc: Nix &lt;nix@esperi.org.uk&gt;
Cc: Kai Krakow &lt;hurikhan77@gmail.com&gt;
Cc: Eric Wheeler &lt;bcache@lists.ewheeler.net&gt;
Cc: Junhui Tang &lt;tang.junhui@zte.com.cn&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>bcache: do not subtract sectors_to_gc for bypassed IO</title>
<updated>2017-11-26T13:50:33+00:00</updated>
<author>
<name>Tang Junhui</name>
<email>tang.junhui@zte.com.cn</email>
</author>
<published>2017-09-06T06:25:53+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=f1f2d73bab02e551627fea150ca3edfb1016057c'/>
<id>f1f2d73bab02e551627fea150ca3edfb1016057c</id>
<content type='text'>
commit 69daf03adef5f7bc13e0ac86b4b8007df1767aab upstream.

Bypassed IOs use no buckets, so do not subtract from sectors_to_gc (which
triggers the gc thread) for them.

Signed-off-by: tang.junhui &lt;tang.junhui@zte.com.cn&gt;
Acked-by: Coly Li &lt;colyli@suse.de&gt;
Reviewed-by: Eric Wheeler &lt;bcache@linux.ewheeler.net&gt;
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
[bwh: Backported to 3.16: adjust context]
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 69daf03adef5f7bc13e0ac86b4b8007df1767aab upstream.

Bypassed IOs use no buckets, so do not subtract from sectors_to_gc (which
triggers the gc thread) for them.

Signed-off-by: tang.junhui &lt;tang.junhui@zte.com.cn&gt;
Acked-by: Coly Li &lt;colyli@suse.de&gt;
Reviewed-by: Eric Wheeler &lt;bcache@linux.ewheeler.net&gt;
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
[bwh: Backported to 3.16: adjust context]
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>bcache: fix sequential large write IO bypass</title>
<updated>2017-11-26T13:50:33+00:00</updated>
<author>
<name>Tang Junhui</name>
<email>tang.junhui@zte.com.cn</email>
</author>
<published>2017-09-06T06:25:52+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=d1550aa7f5c100e6269afbf58acbcd3abfb5dcc6'/>
<id>d1550aa7f5c100e6269afbf58acbcd3abfb5dcc6</id>
<content type='text'>
commit c81ffa32a214c84b08900fbc9d432187bd948eba upstream.

Sequential write IOs were tested with bs=1M by FIO in writeback cache
mode. These IOs were expected to be bypassed, but actually they were not.
We debugged the code and found this in check_should_bypass():
    if (!congested &amp;&amp;
        mode == CACHE_MODE_WRITEBACK &amp;&amp;
        op_is_write(bio_op(bio)) &amp;&amp;
        (bio-&gt;bi_opf &amp; REQ_SYNC))
        goto rescale;
That means that in writeback mode, a write IO with the REQ_SYNC flag will
not be bypassed even though it is a large sequential IO. That is not the
correct thing to do, so this patch removes this code.

Signed-off-by: tang.junhui &lt;tang.junhui@zte.com.cn&gt;
Reviewed-by: Kent Overstreet &lt;kent.overstreet@gmail.com&gt;
Reviewed-by: Eric Wheeler &lt;bcache@linux.ewheeler.net&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
[bwh: Backported to 3.16: deleted code is slightly different]
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit c81ffa32a214c84b08900fbc9d432187bd948eba upstream.

Sequential write IOs were tested with bs=1M by FIO in writeback cache
mode. These IOs were expected to be bypassed, but actually they were not.
We debugged the code and found this in check_should_bypass():
    if (!congested &amp;&amp;
        mode == CACHE_MODE_WRITEBACK &amp;&amp;
        op_is_write(bio_op(bio)) &amp;&amp;
        (bio-&gt;bi_opf &amp; REQ_SYNC))
        goto rescale;
That means that in writeback mode, a write IO with the REQ_SYNC flag will
not be bypassed even though it is a large sequential IO. That is not the
correct thing to do, so this patch removes this code.

Signed-off-by: tang.junhui &lt;tang.junhui@zte.com.cn&gt;
Reviewed-by: Kent Overstreet &lt;kent.overstreet@gmail.com&gt;
Reviewed-by: Eric Wheeler &lt;bcache@linux.ewheeler.net&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
[bwh: Backported to 3.16: deleted code is slightly different]
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>bcache: Kill dead cgroup code</title>
<updated>2014-03-18T19:22:35+00:00</updated>
<author>
<name>Kent Overstreet</name>
<email>kmo@daterainc.com</email>
</author>
<published>2014-01-23T12:42:58+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=3f5e0a34daed197aa55d0c6b466bb4cd03babb4f'/>
<id>3f5e0a34daed197aa55d0c6b466bb4cd03babb4f</id>
<content type='text'>
This hasn't been used or even enabled in ages.

Signed-off-by: Kent Overstreet &lt;kmo@daterainc.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This hasn't been used or even enabled in ages.

Signed-off-by: Kent Overstreet &lt;kmo@daterainc.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>bcache: Fix moving_gc deadlocking with a foreground write</title>
<updated>2014-03-18T19:22:33+00:00</updated>
<author>
<name>Nicholas Swenson</name>
<email>nks@daterainc.com</email>
</author>
<published>2014-01-10T00:03:04+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=da415a096fc06e49d1a15f7a06bcfe6ad44c5d38'/>
<id>da415a096fc06e49d1a15f7a06bcfe6ad44c5d38</id>
<content type='text'>
Deadlock happened because a foreground write slept, waiting for a bucket
to be allocated. Normally the gc would mark buckets available for invalidation.
But the moving_gc was stuck waiting for outstanding writes to complete.
These writes used the bcache_wq, the same queue foreground writes used.

This fix gives moving_gc its own work queue, so it can still finish moving
even if foreground writes are stuck waiting for allocation. It also makes
the work queue a parameter to the data_insert path, so moving_gc can use
its own workqueue for writes.

Signed-off-by: Nicholas Swenson &lt;nks@daterainc.com&gt;
Signed-off-by: Kent Overstreet &lt;kmo@daterainc.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Deadlock happened because a foreground write slept, waiting for a bucket
to be allocated. Normally the gc would mark buckets available for invalidation.
But the moving_gc was stuck waiting for outstanding writes to complete.
These writes used the bcache_wq, the same queue foreground writes used.

This fix gives moving_gc its own work queue, so it can still finish moving
even if foreground writes are stuck waiting for allocation. It also makes
the work queue a parameter to the data_insert path, so moving_gc can use
its own workqueue for writes.

Signed-off-by: Nicholas Swenson &lt;nks@daterainc.com&gt;
Signed-off-by: Kent Overstreet &lt;kmo@daterainc.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>bcache: Fix flash_dev_cache_miss() for real this time</title>
<updated>2014-02-26T02:41:11+00:00</updated>
<author>
<name>Kent Overstreet</name>
<email>kmo@daterainc.com</email>
</author>
<published>2014-01-16T23:04:18+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=1b4eaf3d3809a658c85911e92d9ff64086931efa'/>
<id>1b4eaf3d3809a658c85911e92d9ff64086931efa</id>
<content type='text'>
The code was using sectors to count the number of sectors it was zeroing... but
then it passed it to bio_advance()... after it had been set to 0. Amusing...

Signed-off-by: Kent Overstreet &lt;kmo@daterainc.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The code was using sectors to count the number of sectors it was zeroing... but
then it passed it to bio_advance()... after it had been set to 0. Amusing...

Signed-off-by: Kent Overstreet &lt;kmo@daterainc.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge branch 'bcache-for-3.14' of git://evilpiepirate.org/~kent/linux-bcache into for-linus</title>
<updated>2014-01-30T19:57:55+00:00</updated>
<author>
<name>Jens Axboe</name>
<email>axboe@kernel.dk</email>
</author>
<published>2014-01-30T19:57:55+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=96d2e8b5e288e9d2a40b95161b855944846526a5'/>
<id>96d2e8b5e288e9d2a40b95161b855944846526a5</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
</feed>
