<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/drivers/md/dm.c, branch v5.1</title>
<subtitle>Linux kernel source tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/'/>
<entry>
<title>dm: disable DISCARD if the underlying storage no longer supports it</title>
<updated>2019-04-04T19:33:59+00:00</updated>
<author>
<name>Mike Snitzer</name>
<email>snitzer@redhat.com</email>
</author>
<published>2019-04-03T16:23:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=bcb44433bba5eaff293888ef22ffa07f1f0347d6'/>
<id>bcb44433bba5eaff293888ef22ffa07f1f0347d6</id>
<content type='text'>
Storage devices which report supporting discard commands like
WRITE_SAME_16 with unmap, but reject discard commands sent to the
storage device.  This is a clear storage firmware bug but it doesn't
change the fact that should a program cause discards to be sent to a
multipath device layered on this buggy storage, all paths can end up
failed at the same time from the discards, causing possible I/O loss.

The first discard to a path will fail with Illegal Request, Invalid
field in cdb, e.g.:
 kernel: sd 8:0:8:19: [sdfn] tag#0 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
 kernel: sd 8:0:8:19: [sdfn] tag#0 Sense Key : Illegal Request [current]
 kernel: sd 8:0:8:19: [sdfn] tag#0 Add. Sense: Invalid field in cdb
 kernel: sd 8:0:8:19: [sdfn] tag#0 CDB: Write same(16) 93 08 00 00 00 00 00 a0 08 00 00 00 80 00 00 00
 kernel: blk_update_request: critical target error, dev sdfn, sector 10487808

The SCSI layer converts this to the BLK_STS_TARGET error number, the sd
device disables its support for discard on this path, and because of the
BLK_STS_TARGET error multipath fails the discard without failing any
path or retrying down a different path.  But subsequent discards can
cause path failures.  Any discards sent to the path which already failed
a discard ends up failing with EIO from blk_cloned_rq_check_limits with
an "over max size limit" error since the discard limit was set to 0 by
the sd driver for the path.  As the error is EIO, this now fails the
path and multipath tries to send the discard down the next path.  This
cycle continues as discards are sent until all paths fail.

Fix this by training DM core to disable DISCARD if the underlying
storage already did so.

Also, fix branching in dm_done() and clone_endio() to reflect the
mutually exclussive nature of the IO operations in question.

Cc: stable@vger.kernel.org
Reported-by: David Jeffery &lt;djeffery@redhat.com&gt;
Signed-off-by: Mike Snitzer &lt;snitzer@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Storage devices which report supporting discard commands like
WRITE_SAME_16 with unmap, but reject discard commands sent to the
storage device.  This is a clear storage firmware bug but it doesn't
change the fact that should a program cause discards to be sent to a
multipath device layered on this buggy storage, all paths can end up
failed at the same time from the discards, causing possible I/O loss.

The first discard to a path will fail with Illegal Request, Invalid
field in cdb, e.g.:
 kernel: sd 8:0:8:19: [sdfn] tag#0 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
 kernel: sd 8:0:8:19: [sdfn] tag#0 Sense Key : Illegal Request [current]
 kernel: sd 8:0:8:19: [sdfn] tag#0 Add. Sense: Invalid field in cdb
 kernel: sd 8:0:8:19: [sdfn] tag#0 CDB: Write same(16) 93 08 00 00 00 00 00 a0 08 00 00 00 80 00 00 00
 kernel: blk_update_request: critical target error, dev sdfn, sector 10487808

The SCSI layer converts this to the BLK_STS_TARGET error number, the sd
device disables its support for discard on this path, and because of the
BLK_STS_TARGET error multipath fails the discard without failing any
path or retrying down a different path.  But subsequent discards can
cause path failures.  Any discards sent to the path which already failed
a discard ends up failing with EIO from blk_cloned_rq_check_limits with
an "over max size limit" error since the discard limit was set to 0 by
the sd driver for the path.  As the error is EIO, this now fails the
path and multipath tries to send the discard down the next path.  This
cycle continues as discards are sent until all paths fail.

Fix this by training DM core to disable DISCARD if the underlying
storage already did so.

Also, fix branching in dm_done() and clone_endio() to reflect the
mutually exclussive nature of the IO operations in question.

Cc: stable@vger.kernel.org
Reported-by: David Jeffery &lt;djeffery@redhat.com&gt;
Signed-off-by: Mike Snitzer &lt;snitzer@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>dm: revert 8f50e358153d ("dm: limit the max bio size as BIO_MAX_PAGES * PAGE_SIZE")</title>
<updated>2019-04-01T20:20:36+00:00</updated>
<author>
<name>Mikulas Patocka</name>
<email>mpatocka@redhat.com</email>
</author>
<published>2019-03-21T20:46:12+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=75ae193626de3238ca5fb895868ec91c94e63b1b'/>
<id>75ae193626de3238ca5fb895868ec91c94e63b1b</id>
<content type='text'>
The limit was already incorporated to dm-crypt with commit 4e870e948fba
("dm crypt: fix error with too large bios"), so we don't need to apply
it globally to all targets. The quantity BIO_MAX_PAGES * PAGE_SIZE is
wrong anyway because the variable ti-&gt;max_io_len it is supposed to be in
the units of 512-byte sectors not in bytes.

Reduction of the limit to 1048576 sectors could even cause data
corruption in rare cases - suppose that we have a dm-striped device with
stripe size 768MiB. The target will call dm_set_target_max_io_len with
the value 1572864. The buggy code would reduce it to 1048576. Now, the
dm-core will errorneously split the bios on 1048576-sector boundary
insetad of 1572864-sector boundary and pass these stripe-crossing bios
to the striped target.

Cc: stable@vger.kernel.org # v4.16+
Fixes: 8f50e358153d ("dm: limit the max bio size as BIO_MAX_PAGES * PAGE_SIZE")
Signed-off-by: Mikulas Patocka &lt;mpatocka@redhat.com&gt;
Acked-by: Ming Lei &lt;ming.lei@redhat.com&gt;
Signed-off-by: Mike Snitzer &lt;snitzer@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The limit was already incorporated to dm-crypt with commit 4e870e948fba
("dm crypt: fix error with too large bios"), so we don't need to apply
it globally to all targets. The quantity BIO_MAX_PAGES * PAGE_SIZE is
wrong anyway because the variable ti-&gt;max_io_len it is supposed to be in
the units of 512-byte sectors not in bytes.

Reduction of the limit to 1048576 sectors could even cause data
corruption in rare cases - suppose that we have a dm-striped device with
stripe size 768MiB. The target will call dm_set_target_max_io_len with
the value 1572864. The buggy code would reduce it to 1048576. Now, the
dm-core will errorneously split the bios on 1048576-sector boundary
insetad of 1572864-sector boundary and pass these stripe-crossing bios
to the striped target.

Cc: stable@vger.kernel.org # v4.16+
Fixes: 8f50e358153d ("dm: limit the max bio size as BIO_MAX_PAGES * PAGE_SIZE")
Signed-off-by: Mikulas Patocka &lt;mpatocka@redhat.com&gt;
Acked-by: Ming Lei &lt;ming.lei@redhat.com&gt;
Signed-off-by: Mike Snitzer &lt;snitzer@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>dm: always call blk_queue_split() in dm_process_bio()</title>
<updated>2019-03-05T19:53:32+00:00</updated>
<author>
<name>Mike Snitzer</name>
<email>snitzer@redhat.com</email>
</author>
<published>2019-02-22T14:52:02+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=effd58c95f277744f75d6e08819ac859dbcbd351'/>
<id>effd58c95f277744f75d6e08819ac859dbcbd351</id>
<content type='text'>
Do not just call blk_queue_split() if the bio is_abnormal_io().

Fixes: 568c73a355e ("dm: update dm_process_bio() to split bio if in -&gt;make_request_fn()")
Signed-off-by: Mike Snitzer &lt;snitzer@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Do not just call blk_queue_split() if the bio is_abnormal_io().

Fixes: 568c73a355e ("dm: update dm_process_bio() to split bio if in -&gt;make_request_fn()")
Signed-off-by: Mike Snitzer &lt;snitzer@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>dm: remove unused _rq_tio_cache and _rq_cache</title>
<updated>2019-03-05T19:48:50+00:00</updated>
<author>
<name>Mike Snitzer</name>
<email>snitzer@redhat.com</email>
</author>
<published>2019-02-20T20:37:44+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=e689fbab3ddd92557134ef92c40a780a33299d05'/>
<id>e689fbab3ddd92557134ef92c40a780a33299d05</id>
<content type='text'>
Also move dm_rq_target_io structure definition from dm-rq.h to dm-rq.c

Fixes: 6a23e05c2fe3c6 ("dm: remove legacy request-based IO path")
Signed-off-by: Mike Snitzer &lt;snitzer@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Also move dm_rq_target_io structure definition from dm-rq.h to dm-rq.c

Fixes: 6a23e05c2fe3c6 ("dm: remove legacy request-based IO path")
Signed-off-by: Mike Snitzer &lt;snitzer@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>dm: eliminate 'split_discard_bios' flag from DM target interface</title>
<updated>2019-02-21T04:24:55+00:00</updated>
<author>
<name>Mike Snitzer</name>
<email>snitzer@redhat.com</email>
</author>
<published>2019-01-18T19:19:26+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=61697a6abd24acba941359c6268a94f4afe4a53d'/>
<id>61697a6abd24acba941359c6268a94f4afe4a53d</id>
<content type='text'>
There is no need to have DM core split discards on behalf of a DM target
now that blk_queue_split() handles splitting discards based on the
queue_limits.  A DM target just needs to set max_discard_sectors,
discard_granularity, etc, in queue_limits.

Signed-off-by: Mike Snitzer &lt;snitzer@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
There is no need to have DM core split discards on behalf of a DM target
now that blk_queue_split() handles splitting discards based on the
queue_limits.  A DM target just needs to set max_discard_sectors,
discard_granularity, etc, in queue_limits.

Signed-off-by: Mike Snitzer &lt;snitzer@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>dm: update dm_process_bio() to split bio if in -&gt;make_request_fn()</title>
<updated>2019-02-19T17:25:00+00:00</updated>
<author>
<name>Mike Snitzer</name>
<email>snitzer@redhat.com</email>
</author>
<published>2019-01-18T19:10:37+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=568c73a355e0b845dc983aa59c8a8dc69294b275'/>
<id>568c73a355e0b845dc983aa59c8a8dc69294b275</id>
<content type='text'>
Must call blk_queue_split() otherwise queue_limits for abnormal requests
(e.g. discard, writesame, etc) won't be imposed.

In addition, add dm_queue_split() to simplify DM specific splitting that
is needed for targets that impose ti-&gt;max_io_len.

Signed-off-by: Mike Snitzer &lt;snitzer@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Must call blk_queue_split() otherwise queue_limits for abnormal requests
(e.g. discard, writesame, etc) won't be imposed.

In addition, add dm_queue_split() to simplify DM specific splitting that
is needed for targets that impose ti-&gt;max_io_len.

Signed-off-by: Mike Snitzer &lt;snitzer@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>dm: don't use bio_trim() afterall</title>
<updated>2019-02-06T22:24:37+00:00</updated>
<author>
<name>Mike Snitzer</name>
<email>snitzer@redhat.com</email>
</author>
<published>2019-02-05T22:07:58+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=fa8db4948f5224dae33a0e783e7dec682e145f88'/>
<id>fa8db4948f5224dae33a0e783e7dec682e145f88</id>
<content type='text'>
bio_trim() has an early return, which makes it _not_ idempotent, if the
offset is 0 and the bio's bi_size already matches the requested size.
Prior to DM, all users of bio_trim() were fine with this.  But DM has
exposed the fact that bio_trim()'s early return is incompatible with a
cloned bio whose integrity payload must be trimmed via
bio_integrity_trim().

Fix this by reverting DM back to doing the equivalent of bio_trim() but
in an idempotent manner (so bio_integrity_trim is always performed).

Follow-on work is needed to assess what benefit bio_trim()'s early
return is providing to its existing callers.

Reported-by: Milan Broz &lt;gmazyland@gmail.com&gt;
Fixes: 57c36519e4b94 ("dm: fix clone_bio() to trigger blk_recount_segments()")
Signed-off-by: Mike Snitzer &lt;snitzer@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
bio_trim() has an early return, which makes it _not_ idempotent, if the
offset is 0 and the bio's bi_size already matches the requested size.
Prior to DM, all users of bio_trim() were fine with this.  But DM has
exposed the fact that bio_trim()'s early return is incompatible with a
cloned bio whose integrity payload must be trimmed via
bio_integrity_trim().

Fix this by reverting DM back to doing the equivalent of bio_trim() but
in an idempotent manner (so bio_integrity_trim is always performed).

Follow-on work is needed to assess what benefit bio_trim()'s early
return is providing to its existing callers.

Reported-by: Milan Broz &lt;gmazyland@gmail.com&gt;
Fixes: 57c36519e4b94 ("dm: fix clone_bio() to trigger blk_recount_segments()")
Signed-off-by: Mike Snitzer &lt;snitzer@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>dm: add memory barrier before waitqueue_active</title>
<updated>2019-02-06T22:24:37+00:00</updated>
<author>
<name>Mikulas Patocka</name>
<email>mpatocka@redhat.com</email>
</author>
<published>2019-02-05T10:09:00+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=645efa84f6c7566ea863ed37a8b3247247f72e02'/>
<id>645efa84f6c7566ea863ed37a8b3247247f72e02</id>
<content type='text'>
Block core changes to switch bio-based IO accounting to be percpu had a
side-effect of altering DM core to now rely on calling waitqueue_active
(in both bio-based and request-based) to check if another task is in
dm_wait_for_completion().

A memory barrier is needed before calling waitqueue_active().  DM core
doesn't piggyback on a preceding memory barrier so it must explicitly
use its own.

For more details on why using waitqueue_active() without a preceding
barrier is unsafe, please see the comment before the waitqueue_active()
definition in include/linux/wait.h.

Add the missing memory barrier by switching to using wq_has_sleeper().

Fixes: 6f75723190d8 ("dm: remove the pending IO accounting")
Fixes: c4576aed8d85 ("dm: fix request-based dm's use of dm_wait_for_completion")
Signed-off-by: Mikulas Patocka &lt;mpatocka@redhat.com&gt;
Signed-off-by: Mike Snitzer &lt;snitzer@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Block core changes to switch bio-based IO accounting to be percpu had a
side-effect of altering DM core to now rely on calling waitqueue_active
(in both bio-based and request-based) to check if another task is in
dm_wait_for_completion().

A memory barrier is needed before calling waitqueue_active().  DM core
doesn't piggyback on a preceding memory barrier so it must explicitly
use its own.

For more details on why using waitqueue_active() without a preceding
barrier is unsafe, please see the comment before the waitqueue_active()
definition in include/linux/wait.h.

Add the missing memory barrier by switching to using wq_has_sleeper().

Fixes: 6f75723190d8 ("dm: remove the pending IO accounting")
Fixes: c4576aed8d85 ("dm: fix request-based dm's use of dm_wait_for_completion")
Signed-off-by: Mikulas Patocka &lt;mpatocka@redhat.com&gt;
Signed-off-by: Mike Snitzer &lt;snitzer@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>dm: add missing trace_block_split() to __split_and_process_bio()</title>
<updated>2019-01-22T18:41:48+00:00</updated>
<author>
<name>Mike Snitzer</name>
<email>snitzer@redhat.com</email>
</author>
<published>2019-01-18T06:21:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=075c18c3e124a1511ebc10a89f1858c8a77dcb01'/>
<id>075c18c3e124a1511ebc10a89f1858c8a77dcb01</id>
<content type='text'>
Provides useful context about bio splits in blktrace.

Signed-off-by: Mike Snitzer &lt;snitzer@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Provides useful context about bio splits in blktrace.

Signed-off-by: Mike Snitzer &lt;snitzer@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>dm: fix dm_wq_work() to only use __split_and_process_bio() if appropriate</title>
<updated>2019-01-22T18:41:40+00:00</updated>
<author>
<name>Mike Snitzer</name>
<email>snitzer@redhat.com</email>
</author>
<published>2019-01-17T19:33:01+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=6548c7c538e5658cbce686c2dd1a9b4f5398bf34'/>
<id>6548c7c538e5658cbce686c2dd1a9b4f5398bf34</id>
<content type='text'>
Otherwise targets that don't support/expect IO splitting could resubmit
bios using code paths with unnecessary IO splitting complexity.

Depends-on: 24113d487843 ("dm: avoid indirect call in __dm_make_request")
Fixes: 978e51ba38e00 ("dm: optimize bio-based NVMe IO submission")
Signed-off-by: Mike Snitzer &lt;snitzer@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Otherwise targets that don't support/expect IO splitting could resubmit
bios using code paths with unnecessary IO splitting complexity.

Depends-on: 24113d487843 ("dm: avoid indirect call in __dm_make_request")
Fixes: 978e51ba38e00 ("dm: optimize bio-based NVMe IO submission")
Signed-off-by: Mike Snitzer &lt;snitzer@redhat.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
