<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/drivers/md/dm-thin.c, branch v4.14</title>
<subtitle>Linux kernel source tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/'/>
<entry>
<title>Merge tag 'for-4.14/dm-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm</title>
<updated>2017-09-14T20:43:16+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2017-09-14T20:43:16+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=dff4d1f6fe85627b7ce8e4c5291d8621a1995605'/>
<id>dff4d1f6fe85627b7ce8e4c5291d8621a1995605</id>
<content type='text'>
Pull device mapper updates from Mike Snitzer:

 - Some request-based DM core and DM multipath fixes and cleanups

 - Constify a few variables in DM core and DM integrity

 - Add bufio optimization and checksum failure accounting to DM
   integrity

 - Fix DM integrity to avoid checking integrity of failed reads

 - Fix DM integrity to use init_completion

 - A couple DM log-writes target fixes

 - Simplify DAX flushing by eliminating the unnecessary flush
   abstraction that was stood up for DM's use.

* tag 'for-4.14/dm-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
  dax: remove the pmem_dax_ops-&gt;flush abstraction
  dm integrity: use init_completion instead of COMPLETION_INITIALIZER_ONSTACK
  dm integrity: make blk_integrity_profile structure const
  dm integrity: do not check integrity for failed read operations
  dm log writes: fix &gt;512b sectorsize support
  dm log writes: don't use all the cpu while waiting to log blocks
  dm ioctl: constify ioctl lookup table
  dm: constify argument arrays
  dm integrity: count and display checksum failures
  dm integrity: optimize writing dm-bufio buffers that are partially changed
  dm rq: do not update rq partially in each ending bio
  dm rq: make dm-sq requeuing behavior consistent with dm-mq behavior
  dm mpath: complain about unsupported __multipath_map_bio() return values
  dm mpath: avoid that building with W=1 causes gcc 7 to complain about fall-through
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull device mapper updates from Mike Snitzer:

 - Some request-based DM core and DM multipath fixes and cleanups

 - Constify a few variables in DM core and DM integrity

 - Add bufio optimization and checksum failure accounting to DM
   integrity

 - Fix DM integrity to avoid checking integrity of failed reads

 - Fix DM integrity to use init_completion

 - A couple DM log-writes target fixes

 - Simplify DAX flushing by eliminating the unnecessary flush
   abstraction that was stood up for DM's use.

* tag 'for-4.14/dm-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
  dax: remove the pmem_dax_ops-&gt;flush abstraction
  dm integrity: use init_completion instead of COMPLETION_INITIALIZER_ONSTACK
  dm integrity: make blk_integrity_profile structure const
  dm integrity: do not check integrity for failed read operations
  dm log writes: fix &gt;512b sectorsize support
  dm log writes: don't use all the cpu while waiting to log blocks
  dm ioctl: constify ioctl lookup table
  dm: constify argument arrays
  dm integrity: count and display checksum failures
  dm integrity: optimize writing dm-bufio buffers that are partially changed
  dm rq: do not update rq partially in each ending bio
  dm rq: make dm-sq requeuing behavior consistent with dm-mq behavior
  dm mpath: complain about unsupported __multipath_map_bio() return values
  dm mpath: avoid that building with W=1 causes gcc 7 to complain about fall-through
</pre>
</div>
</content>
</entry>
<entry>
<title>dm: constify argument arrays</title>
<updated>2017-08-28T15:47:18+00:00</updated>
<author>
<name>Eric Biggers</name>
<email>ebiggers@google.com</email>
</author>
<published>2017-06-22T18:32:45+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=5916a22b83041b07d63191fe06206ae0fff6ec7a'/>
<id>5916a22b83041b07d63191fe06206ae0fff6ec7a</id>
<content type='text'>
The arrays of 'struct dm_arg' are never modified by the device-mapper
core, so constify them so that they are placed in .rodata.

(Exception: the args array in dm-raid cannot be constified because it is
allocated on the stack and modified.)

Signed-off-by: Eric Biggers &lt;ebiggers@google.com&gt;
Signed-off-by: Mike Snitzer &lt;snitzer@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The arrays of 'struct dm_arg' are never modified by the device-mapper
core, so constify them so that they are placed in .rodata.

(Exception: the args array in dm-raid cannot be constified because it is
allocated on the stack and modified.)

Signed-off-by: Eric Biggers &lt;ebiggers@google.com&gt;
Signed-off-by: Mike Snitzer &lt;snitzer@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>block: replace bi_bdev with a gendisk pointer and partitions index</title>
<updated>2017-08-23T18:49:55+00:00</updated>
<author>
<name>Christoph Hellwig</name>
<email>hch@lst.de</email>
</author>
<published>2017-08-23T17:10:32+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=74d46992e0d9dee7f1f376de0d56d31614c8a17a'/>
<id>74d46992e0d9dee7f1f376de0d56d31614c8a17a</id>
<content type='text'>
This way we don't need a block_device structure to submit I/O.  The
block_device has different life time rules from the gendisk and
request_queue and is usually only available when the block device node
is open.  Other callers need to explicitly create one (e.g. the lightnvm
passthrough code, or the new nvme multipathing code).

For the actual I/O path all that we need is the gendisk, which exists
once per block device.  But given that the block layer also does
partition remapping we additionally need a partition index, which is
used for said remapping in generic_make_request.

Note that all the block drivers generally want request_queue or
sometimes the gendisk, so this removes a layer of indirection all
over the stack.

Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This way we don't need a block_device structure to submit I/O.  The
block_device has different life time rules from the gendisk and
request_queue and is usually only available when the block device node
is open.  Other callers need to explicitly create one (e.g. the lightnvm
passthrough code, or the new nvme multipathing code).

For the actual I/O path all that we need is the gendisk, which exists
once per block device.  But given that the block layer also does
partition remapping we additionally need a partition index, which is
used for said remapping in generic_make_request.

Note that all the block drivers generally want request_queue or
sometimes the gendisk, so this removes a layer of indirection all
over the stack.

Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge branch 'for-4.13/block' of git://git.kernel.dk/linux-block</title>
<updated>2017-07-03T17:34:51+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2017-07-03T17:34:51+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=c6b1e36c8fa04a6680c44fe0321d0370400e90b6'/>
<id>c6b1e36c8fa04a6680c44fe0321d0370400e90b6</id>
<content type='text'>
Pull core block/IO updates from Jens Axboe:
 "This is the main pull request for the block layer for 4.13. Not a huge
  round in terms of features, but there's a lot of churn related to some
  core cleanups.

  Note this depends on the UUID tree pull request, that Christoph
  already sent out.

  This pull request contains:

   - A series from Christoph, unifying the error/stats codes in the
     block layer. We now use blk_status_t everywhere, instead of using
     different schemes for different places.

   - Also from Christoph, some cleanups around request allocation and IO
     scheduler interactions in blk-mq.

   - And yet another series from Christoph, cleaning up how we handle
     and do bounce buffering in the block layer.

   - A blk-mq debugfs series from Bart, further improving on the support
     we have for exporting internal information to aid debugging IO
     hangs or stalls.

   - Also from Bart, a series that cleans up the request initialization
     differences across types of devices.

   - A series from Goldwyn Rodrigues, allowing the block layer to return
     failure if we will block and the user asked for non-blocking.

   - Patch from Hannes for supporting setting loop devices block size to
     that of the underlying device.

   - Two series of patches from Javier, fixing various issues with
     lightnvm, particular around pblk.

   - A series from me, adding support for write hints. This comes with
     NVMe support as well, so applications can help guide data placement
     on flash to improve performance, latencies, and write
     amplification.

   - A series from Ming, improving and hardening blk-mq support for
     stopping/starting and quiescing hardware queues.

   - Two pull requests for NVMe updates. Nothing major on the feature
     side, but lots of cleanups and bug fixes. From the usual crew.

   - A series from Neil Brown, greatly improving the bio rescue set
     support. Most notably, this kills the bio rescue work queues, if we
     don't really need them.

   - Lots of other little bug fixes that are all over the place"

* 'for-4.13/block' of git://git.kernel.dk/linux-block: (217 commits)
  lightnvm: pblk: set line bitmap check under debug
  lightnvm: pblk: verify that cache read is still valid
  lightnvm: pblk: add initialization check
  lightnvm: pblk: remove target using async. I/Os
  lightnvm: pblk: use vmalloc for GC data buffer
  lightnvm: pblk: use right metadata buffer for recovery
  lightnvm: pblk: schedule if data is not ready
  lightnvm: pblk: remove unused return variable
  lightnvm: pblk: fix double-free on pblk init
  lightnvm: pblk: fix bad le64 assignations
  nvme: Makefile: remove dead build rule
  blk-mq: map all HWQ also in hyperthreaded system
  nvmet-rdma: register ib_client to not deadlock in device removal
  nvme_fc: fix error recovery on link down.
  nvmet_fc: fix crashes on bad opcodes
  nvme_fc: Fix crash when nvme controller connection fails.
  nvme_fc: replace ioabort msleep loop with completion
  nvme_fc: fix double calls to nvme_cleanup_cmd()
  nvme-fabrics: verify that a controller returns the correct NQN
  nvme: simplify nvme_dev_attrs_are_visible
  ...
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull core block/IO updates from Jens Axboe:
 "This is the main pull request for the block layer for 4.13. Not a huge
  round in terms of features, but there's a lot of churn related to some
  core cleanups.

  Note this depends on the UUID tree pull request, that Christoph
  already sent out.

  This pull request contains:

   - A series from Christoph, unifying the error/stats codes in the
     block layer. We now use blk_status_t everywhere, instead of using
     different schemes for different places.

   - Also from Christoph, some cleanups around request allocation and IO
     scheduler interactions in blk-mq.

   - And yet another series from Christoph, cleaning up how we handle
     and do bounce buffering in the block layer.

   - A blk-mq debugfs series from Bart, further improving on the support
     we have for exporting internal information to aid debugging IO
     hangs or stalls.

   - Also from Bart, a series that cleans up the request initialization
     differences across types of devices.

   - A series from Goldwyn Rodrigues, allowing the block layer to return
     failure if we will block and the user asked for non-blocking.

   - Patch from Hannes for supporting setting loop devices block size to
     that of the underlying device.

   - Two series of patches from Javier, fixing various issues with
     lightnvm, particular around pblk.

   - A series from me, adding support for write hints. This comes with
     NVMe support as well, so applications can help guide data placement
     on flash to improve performance, latencies, and write
     amplification.

   - A series from Ming, improving and hardening blk-mq support for
     stopping/starting and quiescing hardware queues.

   - Two pull requests for NVMe updates. Nothing major on the feature
     side, but lots of cleanups and bug fixes. From the usual crew.

   - A series from Neil Brown, greatly improving the bio rescue set
     support. Most notably, this kills the bio rescue work queues, if we
     don't really need them.

   - Lots of other little bug fixes that are all over the place"

* 'for-4.13/block' of git://git.kernel.dk/linux-block: (217 commits)
  lightnvm: pblk: set line bitmap check under debug
  lightnvm: pblk: verify that cache read is still valid
  lightnvm: pblk: add initialization check
  lightnvm: pblk: remove target using async. I/Os
  lightnvm: pblk: use vmalloc for GC data buffer
  lightnvm: pblk: use right metadata buffer for recovery
  lightnvm: pblk: schedule if data is not ready
  lightnvm: pblk: remove unused return variable
  lightnvm: pblk: fix double-free on pblk init
  lightnvm: pblk: fix bad le64 assignations
  nvme: Makefile: remove dead build rule
  blk-mq: map all HWQ also in hyperthreaded system
  nvmet-rdma: register ib_client to not deadlock in device removal
  nvme_fc: fix error recovery on link down.
  nvmet_fc: fix crashes on bad opcodes
  nvme_fc: Fix crash when nvme controller connection fails.
  nvme_fc: replace ioabort msleep loop with completion
  nvme_fc: fix double calls to nvme_cleanup_cmd()
  nvme-fabrics: verify that a controller returns the correct NQN
  nvme: simplify nvme_dev_attrs_are_visible
  ...
</pre>
</div>
</content>
</entry>
<entry>
<title>dm thin: do not queue freed thin mapping for next stage processing</title>
<updated>2017-06-27T19:14:34+00:00</updated>
<author>
<name>Vallish Vaidyeshwara</name>
<email>vallish@amazon.com</email>
</author>
<published>2017-06-23T18:53:06+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=00a0ea33b495ee6149bf5a77ac5807ce87323abb'/>
<id>00a0ea33b495ee6149bf5a77ac5807ce87323abb</id>
<content type='text'>
process_prepared_discard_passdown_pt1() should cleanup
dm_thin_new_mapping in cases of error.

dm_pool_inc_data_range() can fail trying to get a block reference:

metadata operation 'dm_pool_inc_data_range' failed: error = -61

When dm_pool_inc_data_range() fails, dm thin aborts current metadata
transaction and marks pool as PM_READ_ONLY. Memory for thin mapping
is released as well. However, current thin mapping will be queued
onto next stage as part of queue_passdown_pt2() or passdown_endio().
This dangling thin mapping memory when processed and accessed in
next stage will lead to device mapper crashing.

Code flow without fix:
-&gt; process_prepared_discard_passdown_pt1(m)
   -&gt; dm_thin_remove_range()
   -&gt; discard passdown
      --&gt; passdown_endio(m) queues m onto next stage
   -&gt; dm_pool_inc_data_range() fails, frees memory m
            but does not remove it from next stage queue

-&gt; process_prepared_discard_passdown_pt2(m)
   -&gt; processes freed memory m and crashes

One such stack:

Call Trace:
[&lt;ffffffffa037a46f&gt;] dm_cell_release_no_holder+0x2f/0x70 [dm_bio_prison]
[&lt;ffffffffa039b6dc&gt;] cell_defer_no_holder+0x3c/0x80 [dm_thin_pool]
[&lt;ffffffffa039b88b&gt;] process_prepared_discard_passdown_pt2+0x4b/0x90 [dm_thin_pool]
[&lt;ffffffffa0399611&gt;] process_prepared+0x81/0xa0 [dm_thin_pool]
[&lt;ffffffffa039e735&gt;] do_worker+0xc5/0x820 [dm_thin_pool]
[&lt;ffffffff8152bf54&gt;] ? __schedule+0x244/0x680
[&lt;ffffffff81087e72&gt;] ? pwq_activate_delayed_work+0x42/0xb0
[&lt;ffffffff81089f53&gt;] process_one_work+0x153/0x3f0
[&lt;ffffffff8108a71b&gt;] worker_thread+0x12b/0x4b0
[&lt;ffffffff8108a5f0&gt;] ? rescuer_thread+0x350/0x350
[&lt;ffffffff8108fd6a&gt;] kthread+0xca/0xe0
[&lt;ffffffff8108fca0&gt;] ? kthread_park+0x60/0x60
[&lt;ffffffff81530b45&gt;] ret_from_fork+0x25/0x30

The fix is to first take the block ref count for discarded block and
then do a passdown discard of this block. If block ref count fails,
then bail out aborting current metadata transaction, mark pool as
PM_READ_ONLY and also free current thin mapping memory (existing error
handling code) without queueing this thin mapping onto next stage of
processing. If block ref count succeeds, then passdown discard of this
block. Discard callback of passdown_endio() will queue this thin mapping
onto next stage of processing.

Code flow with fix:
-&gt; process_prepared_discard_passdown_pt1(m)
   -&gt; dm_thin_remove_range()
   -&gt; dm_pool_inc_data_range()
      --&gt; if fails, free memory m and bail out
   -&gt; discard passdown
      --&gt; passdown_endio(m) queues m onto next stage

Cc: stable &lt;stable@vger.kernel.org&gt; # v4.9+
Reviewed-by: Eduardo Valentin &lt;eduval@amazon.com&gt;
Reviewed-by: Cristian Gafton &lt;gafton@amazon.com&gt;
Reviewed-by: Anchal Agarwal &lt;anchalag@amazon.com&gt;
Signed-off-by: Vallish Vaidyeshwara &lt;vallish@amazon.com&gt;
Reviewed-by: Joe Thornber &lt;ejt@redhat.com&gt;
Signed-off-by: Mike Snitzer &lt;snitzer@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
process_prepared_discard_passdown_pt1() should cleanup
dm_thin_new_mapping in cases of error.

dm_pool_inc_data_range() can fail trying to get a block reference:

metadata operation 'dm_pool_inc_data_range' failed: error = -61

When dm_pool_inc_data_range() fails, dm thin aborts current metadata
transaction and marks pool as PM_READ_ONLY. Memory for thin mapping
is released as well. However, current thin mapping will be queued
onto next stage as part of queue_passdown_pt2() or passdown_endio().
This dangling thin mapping memory when processed and accessed in
next stage will lead to device mapper crashing.

Code flow without fix:
-&gt; process_prepared_discard_passdown_pt1(m)
   -&gt; dm_thin_remove_range()
   -&gt; discard passdown
      --&gt; passdown_endio(m) queues m onto next stage
   -&gt; dm_pool_inc_data_range() fails, frees memory m
            but does not remove it from next stage queue

-&gt; process_prepared_discard_passdown_pt2(m)
   -&gt; processes freed memory m and crashes

One such stack:

Call Trace:
[&lt;ffffffffa037a46f&gt;] dm_cell_release_no_holder+0x2f/0x70 [dm_bio_prison]
[&lt;ffffffffa039b6dc&gt;] cell_defer_no_holder+0x3c/0x80 [dm_thin_pool]
[&lt;ffffffffa039b88b&gt;] process_prepared_discard_passdown_pt2+0x4b/0x90 [dm_thin_pool]
[&lt;ffffffffa0399611&gt;] process_prepared+0x81/0xa0 [dm_thin_pool]
[&lt;ffffffffa039e735&gt;] do_worker+0xc5/0x820 [dm_thin_pool]
[&lt;ffffffff8152bf54&gt;] ? __schedule+0x244/0x680
[&lt;ffffffff81087e72&gt;] ? pwq_activate_delayed_work+0x42/0xb0
[&lt;ffffffff81089f53&gt;] process_one_work+0x153/0x3f0
[&lt;ffffffff8108a71b&gt;] worker_thread+0x12b/0x4b0
[&lt;ffffffff8108a5f0&gt;] ? rescuer_thread+0x350/0x350
[&lt;ffffffff8108fd6a&gt;] kthread+0xca/0xe0
[&lt;ffffffff8108fca0&gt;] ? kthread_park+0x60/0x60
[&lt;ffffffff81530b45&gt;] ret_from_fork+0x25/0x30

The fix is to first take the block ref count for discarded block and
then do a passdown discard of this block. If block ref count fails,
then bail out aborting current metadata transaction, mark pool as
PM_READ_ONLY and also free current thin mapping memory (existing error
handling code) without queueing this thin mapping onto next stage of
processing. If block ref count succeeds, then passdown discard of this
block. Discard callback of passdown_endio() will queue this thin mapping
onto next stage of processing.

Code flow with fix:
-&gt; process_prepared_discard_passdown_pt1(m)
   -&gt; dm_thin_remove_range()
   -&gt; dm_pool_inc_data_range()
      --&gt; if fails, free memory m and bail out
   -&gt; discard passdown
      --&gt; passdown_endio(m) queues m onto next stage

Cc: stable &lt;stable@vger.kernel.org&gt; # v4.9+
Reviewed-by: Eduardo Valentin &lt;eduval@amazon.com&gt;
Reviewed-by: Cristian Gafton &lt;gafton@amazon.com&gt;
Reviewed-by: Anchal Agarwal &lt;anchalag@amazon.com&gt;
Signed-off-by: Vallish Vaidyeshwara &lt;vallish@amazon.com&gt;
Reviewed-by: Joe Thornber &lt;ejt@redhat.com&gt;
Signed-off-by: Mike Snitzer &lt;snitzer@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>block: switch bios to blk_status_t</title>
<updated>2017-06-09T15:27:32+00:00</updated>
<author>
<name>Christoph Hellwig</name>
<email>hch@lst.de</email>
</author>
<published>2017-06-03T07:38:06+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=4e4cbee93d56137ebff722be022cae5f70ef84fb'/>
<id>4e4cbee93d56137ebff722be022cae5f70ef84fb</id>
<content type='text'>
Replace bi_error with a new bi_status to allow for a clear conversion.
Note that device mapper overloaded bi_error with a private value, which
we'll have to keep arround at least for now and thus propagate to a
proper blk_status_t value.

Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Replace bi_error with a new bi_status to allow for a clear conversion.
Note that device mapper overloaded bi_error with a private value, which
we'll have to keep arround at least for now and thus propagate to a
proper blk_status_t value.

Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>dm: change -&gt;end_io calling convention</title>
<updated>2017-06-09T15:27:32+00:00</updated>
<author>
<name>Christoph Hellwig</name>
<email>hch@lst.de</email>
</author>
<published>2017-06-03T07:38:03+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=1be5690984588953e759af0a4c6ddac182a1806c'/>
<id>1be5690984588953e759af0a4c6ddac182a1806c</id>
<content type='text'>
Turn the error paramter into a pointer so that target drivers can change
the value, and make sure only DM_ENDIO_* values are returned from the
methods.

Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Mike Snitzer &lt;snitzer@redhat.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Turn the error paramter into a pointer so that target drivers can change
the value, and make sure only DM_ENDIO_* values are returned from the
methods.

Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Mike Snitzer &lt;snitzer@redhat.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge tag 'for-4.12/dm-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm</title>
<updated>2017-05-03T17:31:20+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2017-05-03T17:31:20+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=d35a878ae1c50977b55e352fd46e36e35add72a0'/>
<id>d35a878ae1c50977b55e352fd46e36e35add72a0</id>
<content type='text'>
Pull device mapper updates from Mike Snitzer:

 - A major update for DM cache that reduces the latency for deciding
   whether blocks should migrate to/from the cache. The bio-prison-v2
   interface supports this improvement by enabling direct dispatch of
   work to workqueues rather than having to delay the actual work
   dispatch to the DM cache core. So the dm-cache policies are much more
   nimble by being able to drive IO as they see fit. One immediate
   benefit from the improved latency is a cache that should be much more
   adaptive to changing workloads.

 - Add a new DM integrity target that emulates a block device that has
   additional per-sector tags that can be used for storing integrity
   information.

 - Add a new authenticated encryption feature to the DM crypt target
   that builds on the capabilities provided by the DM integrity target.

 - Add MD interface for switching the raid4/5/6 journal mode and update
   the DM raid target to use it to enable aid4/5/6 journal write-back
   support.

 - Switch the DM verity target over to using the asynchronous hash
   crypto API (this helps work better with architectures that have
   access to off-CPU algorithm providers, which should reduce CPU
   utilization).

 - Various request-based DM and DM multipath fixes and improvements from
   Bart and Christoph.

 - A DM thinp target fix for a bio structure leak that occurs for each
   discard IFF discard passdown is enabled.

 - A fix for a possible deadlock in DM bufio and a fix to re-check the
   new buffer allocation watermark in the face of competing admin
   changes to the 'max_cache_size_bytes' tunable.

 - A couple DM core cleanups.

* tag 'for-4.12/dm-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: (50 commits)
  dm bufio: check new buffer allocation watermark every 30 seconds
  dm bufio: avoid a possible ABBA deadlock
  dm mpath: make it easier to detect unintended I/O request flushes
  dm mpath: cleanup QUEUE_IF_NO_PATH bit manipulation by introducing assign_bit()
  dm mpath: micro-optimize the hot path relative to MPATHF_QUEUE_IF_NO_PATH
  dm: introduce enum dm_queue_mode to cleanup related code
  dm mpath: verify __pg_init_all_paths locking assumptions at runtime
  dm: verify suspend_locking assumptions at runtime
  dm block manager: remove an unused argument from dm_block_manager_create()
  dm rq: check blk_mq_register_dev() return value in dm_mq_init_request_queue()
  dm mpath: delay requeuing while path initialization is in progress
  dm mpath: avoid that path removal can trigger an infinite loop
  dm mpath: split and rename activate_path() to prepare for its expanded use
  dm ioctl: prevent stack leak in dm ioctl call
  dm integrity: use previously calculated log2 of sectors_per_block
  dm integrity: use hex2bin instead of open-coded variant
  dm crypt: replace custom implementation of hex2bin()
  dm crypt: remove obsolete references to per-CPU state
  dm verity: switch to using asynchronous hash crypto API
  dm crypt: use WQ_HIGHPRI for the IO and crypt workqueues
  ...
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull device mapper updates from Mike Snitzer:

 - A major update for DM cache that reduces the latency for deciding
   whether blocks should migrate to/from the cache. The bio-prison-v2
   interface supports this improvement by enabling direct dispatch of
   work to workqueues rather than having to delay the actual work
   dispatch to the DM cache core. So the dm-cache policies are much more
   nimble by being able to drive IO as they see fit. One immediate
   benefit from the improved latency is a cache that should be much more
   adaptive to changing workloads.

 - Add a new DM integrity target that emulates a block device that has
   additional per-sector tags that can be used for storing integrity
   information.

 - Add a new authenticated encryption feature to the DM crypt target
   that builds on the capabilities provided by the DM integrity target.

 - Add MD interface for switching the raid4/5/6 journal mode and update
   the DM raid target to use it to enable aid4/5/6 journal write-back
   support.

 - Switch the DM verity target over to using the asynchronous hash
   crypto API (this helps work better with architectures that have
   access to off-CPU algorithm providers, which should reduce CPU
   utilization).

 - Various request-based DM and DM multipath fixes and improvements from
   Bart and Christoph.

 - A DM thinp target fix for a bio structure leak that occurs for each
   discard IFF discard passdown is enabled.

 - A fix for a possible deadlock in DM bufio and a fix to re-check the
   new buffer allocation watermark in the face of competing admin
   changes to the 'max_cache_size_bytes' tunable.

 - A couple DM core cleanups.

* tag 'for-4.12/dm-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: (50 commits)
  dm bufio: check new buffer allocation watermark every 30 seconds
  dm bufio: avoid a possible ABBA deadlock
  dm mpath: make it easier to detect unintended I/O request flushes
  dm mpath: cleanup QUEUE_IF_NO_PATH bit manipulation by introducing assign_bit()
  dm mpath: micro-optimize the hot path relative to MPATHF_QUEUE_IF_NO_PATH
  dm: introduce enum dm_queue_mode to cleanup related code
  dm mpath: verify __pg_init_all_paths locking assumptions at runtime
  dm: verify suspend_locking assumptions at runtime
  dm block manager: remove an unused argument from dm_block_manager_create()
  dm rq: check blk_mq_register_dev() return value in dm_mq_init_request_queue()
  dm mpath: delay requeuing while path initialization is in progress
  dm mpath: avoid that path removal can trigger an infinite loop
  dm mpath: split and rename activate_path() to prepare for its expanded use
  dm ioctl: prevent stack leak in dm ioctl call
  dm integrity: use previously calculated log2 of sectors_per_block
  dm integrity: use hex2bin instead of open-coded variant
  dm crypt: replace custom implementation of hex2bin()
  dm crypt: remove obsolete references to per-CPU state
  dm verity: switch to using asynchronous hash crypto API
  dm crypt: use WQ_HIGHPRI for the IO and crypt workqueues
  ...
</pre>
</div>
</content>
</entry>
<entry>
<title>dm thin: fix a memory leak when passing discard bio down</title>
<updated>2017-04-24T18:58:10+00:00</updated>
<author>
<name>Dennis Yang</name>
<email>dennisyang@qnap.com</email>
</author>
<published>2017-04-18T07:27:06+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=948f581a53b704b984aa20df009f0a2b4cf7f907'/>
<id>948f581a53b704b984aa20df009f0a2b4cf7f907</id>
<content type='text'>
dm-thin does not free the discard_parent bio after all chained sub
bios finished. The following kmemleak report could be observed after
pool with discard_passdown option processes discard bios in
linux v4.11-rc7. To fix this, we drop the discard_parent bio reference
when its endio (passdown_endio) called.

unreferenced object 0xffff8803d6b29700 (size 256):
  comm "kworker/u8:0", pid 30349, jiffies 4379504020 (age 143002.776s)
  hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    01 00 00 00 00 00 00 f0 00 00 00 00 00 00 00 00  ................
  backtrace:
    [&lt;ffffffff81a5efd9&gt;] kmemleak_alloc+0x49/0xa0
    [&lt;ffffffff8114ec34&gt;] kmem_cache_alloc+0xb4/0x100
    [&lt;ffffffff8110eec0&gt;] mempool_alloc_slab+0x10/0x20
    [&lt;ffffffff8110efa5&gt;] mempool_alloc+0x55/0x150
    [&lt;ffffffff81374939&gt;] bio_alloc_bioset+0xb9/0x260
    [&lt;ffffffffa018fd20&gt;] process_prepared_discard_passdown_pt1+0x40/0x1c0 [dm_thin_pool]
    [&lt;ffffffffa018b409&gt;] break_up_discard_bio+0x1a9/0x200 [dm_thin_pool]
    [&lt;ffffffffa018b484&gt;] process_discard_cell_passdown+0x24/0x40 [dm_thin_pool]
    [&lt;ffffffffa018b24d&gt;] process_discard_bio+0xdd/0xf0 [dm_thin_pool]
    [&lt;ffffffffa018ecf6&gt;] do_worker+0xa76/0xd50 [dm_thin_pool]
    [&lt;ffffffff81086239&gt;] process_one_work+0x139/0x370
    [&lt;ffffffff810867b1&gt;] worker_thread+0x61/0x450
    [&lt;ffffffff8108b316&gt;] kthread+0xd6/0xf0
    [&lt;ffffffff81a6cd1f&gt;] ret_from_fork+0x3f/0x70
    [&lt;ffffffffffffffff&gt;] 0xffffffffffffffff

Cc: stable@vger.kernel.org
Signed-off-by: Dennis Yang &lt;dennisyang@qnap.com&gt;
Signed-off-by: Mike Snitzer &lt;snitzer@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
dm-thin does not free the discard_parent bio after all chained sub
bios finished. The following kmemleak report could be observed after
pool with discard_passdown option processes discard bios in
linux v4.11-rc7. To fix this, we drop the discard_parent bio reference
when its endio (passdown_endio) called.

unreferenced object 0xffff8803d6b29700 (size 256):
  comm "kworker/u8:0", pid 30349, jiffies 4379504020 (age 143002.776s)
  hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    01 00 00 00 00 00 00 f0 00 00 00 00 00 00 00 00  ................
  backtrace:
    [&lt;ffffffff81a5efd9&gt;] kmemleak_alloc+0x49/0xa0
    [&lt;ffffffff8114ec34&gt;] kmem_cache_alloc+0xb4/0x100
    [&lt;ffffffff8110eec0&gt;] mempool_alloc_slab+0x10/0x20
    [&lt;ffffffff8110efa5&gt;] mempool_alloc+0x55/0x150
    [&lt;ffffffff81374939&gt;] bio_alloc_bioset+0xb9/0x260
    [&lt;ffffffffa018fd20&gt;] process_prepared_discard_passdown_pt1+0x40/0x1c0 [dm_thin_pool]
    [&lt;ffffffffa018b409&gt;] break_up_discard_bio+0x1a9/0x200 [dm_thin_pool]
    [&lt;ffffffffa018b484&gt;] process_discard_cell_passdown+0x24/0x40 [dm_thin_pool]
    [&lt;ffffffffa018b24d&gt;] process_discard_bio+0xdd/0xf0 [dm_thin_pool]
    [&lt;ffffffffa018ecf6&gt;] do_worker+0xa76/0xd50 [dm_thin_pool]
    [&lt;ffffffff81086239&gt;] process_one_work+0x139/0x370
    [&lt;ffffffff810867b1&gt;] worker_thread+0x61/0x450
    [&lt;ffffffff8108b316&gt;] kthread+0xd6/0xf0
    [&lt;ffffffff81a6cd1f&gt;] ret_from_fork+0x3f/0x70
    [&lt;ffffffffffffffff&gt;] 0xffffffffffffffff

Cc: stable@vger.kernel.org
Signed-off-by: Dennis Yang &lt;dennisyang@qnap.com&gt;
Signed-off-by: Mike Snitzer &lt;snitzer@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>block: remove the discard_zeroes_data flag</title>
<updated>2017-04-08T17:25:38+00:00</updated>
<author>
<name>Christoph Hellwig</name>
<email>hch@lst.de</email>
</author>
<published>2017-04-05T17:21:23+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=48920ff2a5a940cd07d12cc79e4a2c75f1185aee'/>
<id>48920ff2a5a940cd07d12cc79e4a2c75f1185aee</id>
<content type='text'>
Now that we use the proper REQ_OP_WRITE_ZEROES operation everywhere we can
kill this hack.

Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Reviewed-by: Martin K. Petersen &lt;martin.petersen@oracle.com&gt;
Reviewed-by: Hannes Reinecke &lt;hare@suse.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Now that we use the proper REQ_OP_WRITE_ZEROES operation everywhere we can
kill this hack.

Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Reviewed-by: Martin K. Petersen &lt;martin.petersen@oracle.com&gt;
Reviewed-by: Hannes Reinecke &lt;hare@suse.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
