<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/include/linux/blk-mq.h, branch v5.2-rc5</title>
<subtitle>Linux kernel source tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/'/>
<entry>
<title>blk-mq: always free hctx after request queue is freed</title>
<updated>2019-05-04T13:24:08+00:00</updated>
<author>
<name>Ming Lei</name>
<email>ming.lei@redhat.com</email>
</author>
<published>2019-04-30T01:52:27+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=2f8f1336a48bd5186de3476da0a3e2ec06d0533a'/>
<id>2f8f1336a48bd5186de3476da0a3e2ec06d0533a</id>
<content type='text'>
In normal queue cleanup path, hctx is released after request queue
is freed, see blk_mq_release().

However, in __blk_mq_update_nr_hw_queues(), hctx may be freed because
of hw queues shrinking. This way is easy to cause use-after-free,
because: one implicit rule is that it is safe to call almost all block
layer APIs if the request queue is alive; and one hctx may be retrieved
by one API, then the hctx can be freed by blk_mq_update_nr_hw_queues();
finally use-after-free is triggered.

Fixes this issue by always freeing hctx after releasing request queue.
If some hctxs are removed in blk_mq_update_nr_hw_queues(), introduce
a per-queue list to hold them, then try to resuse these hctxs if numa
node is matched.

Cc: Dongli Zhang &lt;dongli.zhang@oracle.com&gt;
Cc: James Smart &lt;james.smart@broadcom.com&gt;
Cc: Bart Van Assche &lt;bart.vanassche@wdc.com&gt;
Cc: linux-scsi@vger.kernel.org,
Cc: Martin K . Petersen &lt;martin.petersen@oracle.com&gt;,
Cc: Christoph Hellwig &lt;hch@lst.de&gt;,
Cc: James E . J . Bottomley &lt;jejb@linux.vnet.ibm.com&gt;,
Reviewed-by: Hannes Reinecke &lt;hare@suse.com&gt;
Tested-by: James Smart &lt;james.smart@broadcom.com&gt;
Signed-off-by: Ming Lei &lt;ming.lei@redhat.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
In normal queue cleanup path, hctx is released after request queue
is freed, see blk_mq_release().

However, in __blk_mq_update_nr_hw_queues(), hctx may be freed because
of hw queues shrinking. This way is easy to cause use-after-free,
because: one implicit rule is that it is safe to call almost all block
layer APIs if the request queue is alive; and one hctx may be retrieved
by one API, then the hctx can be freed by blk_mq_update_nr_hw_queues();
finally use-after-free is triggered.

Fixes this issue by always freeing hctx after releasing request queue.
If some hctxs are removed in blk_mq_update_nr_hw_queues(), introduce
a per-queue list to hold them, then try to resuse these hctxs if numa
node is matched.

Cc: Dongli Zhang &lt;dongli.zhang@oracle.com&gt;
Cc: James Smart &lt;james.smart@broadcom.com&gt;
Cc: Bart Van Assche &lt;bart.vanassche@wdc.com&gt;
Cc: linux-scsi@vger.kernel.org,
Cc: Martin K . Petersen &lt;martin.petersen@oracle.com&gt;,
Cc: Christoph Hellwig &lt;hch@lst.de&gt;,
Cc: James E . J . Bottomley &lt;jejb@linux.vnet.ibm.com&gt;,
Reviewed-by: Hannes Reinecke &lt;hare@suse.com&gt;
Tested-by: James Smart &lt;james.smart@broadcom.com&gt;
Signed-off-by: Ming Lei &lt;ming.lei@redhat.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>blk-mq: introduce blk_mq_complete_request_sync()</title>
<updated>2019-04-10T15:57:33+00:00</updated>
<author>
<name>Ming Lei</name>
<email>ming.lei@redhat.com</email>
</author>
<published>2019-04-08T22:31:21+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=1b8f21b74c3c9c82fce5a751d7aefb7cc0b8d33d'/>
<id>1b8f21b74c3c9c82fce5a751d7aefb7cc0b8d33d</id>
<content type='text'>
In NVMe's error handler, follows the typical steps of tearing down
hardware for recovering controller:

1) stop blk_mq hw queues
2) stop the real hw queues
3) cancel in-flight requests via
	blk_mq_tagset_busy_iter(tags, cancel_request, ...)
cancel_request():
	mark the request as abort
	blk_mq_complete_request(req);
4) destroy real hw queues

However, there may be race between #3 and #4, because blk_mq_complete_request()
may run q-&gt;mq_ops-&gt;complete(rq) remotelly and asynchronously, and
-&gt;complete(rq) may be run after #4.

This patch introduces blk_mq_complete_request_sync() for fixing the
above race.

Cc: Sagi Grimberg &lt;sagi@grimberg.me&gt;
Cc: Bart Van Assche &lt;bvanassche@acm.org&gt;
Cc: James Smart &lt;james.smart@broadcom.com&gt;
Cc: linux-nvme@lists.infradead.org
Reviewed-by: Keith Busch &lt;keith.busch@intel.com&gt;
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Ming Lei &lt;ming.lei@redhat.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
In NVMe's error handler, follows the typical steps of tearing down
hardware for recovering controller:

1) stop blk_mq hw queues
2) stop the real hw queues
3) cancel in-flight requests via
	blk_mq_tagset_busy_iter(tags, cancel_request, ...)
cancel_request():
	mark the request as abort
	blk_mq_complete_request(req);
4) destroy real hw queues

However, there may be race between #3 and #4, because blk_mq_complete_request()
may run q-&gt;mq_ops-&gt;complete(rq) remotelly and asynchronously, and
-&gt;complete(rq) may be run after #4.

This patch introduces blk_mq_complete_request_sync() for fixing the
above race.

Cc: Sagi Grimberg &lt;sagi@grimberg.me&gt;
Cc: Bart Van Assche &lt;bvanassche@acm.org&gt;
Cc: James Smart &lt;james.smart@broadcom.com&gt;
Cc: linux-nvme@lists.infradead.org
Reviewed-by: Keith Busch &lt;keith.busch@intel.com&gt;
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Ming Lei &lt;ming.lei@redhat.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>block: Unexport blk_mq_add_to_requeue_list()</title>
<updated>2019-03-20T20:19:36+00:00</updated>
<author>
<name>Bart Van Assche</name>
<email>bvanassche@acm.org</email>
</author>
<published>2019-03-20T20:14:37+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=e6c987120e24cb913cb7bd4e675129a30fa49e0d'/>
<id>e6c987120e24cb913cb7bd4e675129a30fa49e0d</id>
<content type='text'>
This function is not used outside the block layer core. Hence unexport it.

Cc: Christoph Hellwig &lt;hch@lst.de&gt;
Cc: Ming Lei &lt;ming.lei@redhat.com&gt;
Signed-off-by: Bart Van Assche &lt;bvanassche@acm.org&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This function is not used outside the block layer core. Hence unexport it.

Cc: Christoph Hellwig &lt;hch@lst.de&gt;
Cc: Ming Lei &lt;ming.lei@redhat.com&gt;
Signed-off-by: Bart Van Assche &lt;bvanassche@acm.org&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>blk-mq: remove unused 'nr_expired' from blk_mq_hw_ctx</title>
<updated>2019-03-19T15:04:06+00:00</updated>
<author>
<name>Dongli Zhang</name>
<email>dongli.zhang@oracle.com</email>
</author>
<published>2019-03-19T15:05:18+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=9496c015ed39ddfce971d63a1442e6d258504a7d'/>
<id>9496c015ed39ddfce971d63a1442e6d258504a7d</id>
<content type='text'>
There is no usage of 'nr_expired'.

The 'nr_expired' was introduced by commit 1d9bd5161ba3 ("blk-mq: replace
timeout synchronization with a RCU and generation based scheme"). Its usage
was removed since commit 12f5b9314545 ("blk-mq: Remove generation
seqeunce").

Signed-off-by: Dongli Zhang &lt;dongli.zhang@oracle.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
There is no usage of 'nr_expired'.

The 'nr_expired' was introduced by commit 1d9bd5161ba3 ("blk-mq: replace
timeout synchronization with a RCU and generation based scheme"). Its usage
was removed since commit 12f5b9314545 ("blk-mq: Remove generation
seqeunce").

Signed-off-by: Dongli Zhang &lt;dongli.zhang@oracle.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>block: kill BLK_MQ_F_SG_MERGE</title>
<updated>2019-02-15T15:40:12+00:00</updated>
<author>
<name>Ming Lei</name>
<email>ming.lei@redhat.com</email>
</author>
<published>2019-02-15T11:13:24+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=56d18f62f556b80105e38e7975975cf7465aae3e'/>
<id>56d18f62f556b80105e38e7975975cf7465aae3e</id>
<content type='text'>
QUEUE_FLAG_NO_SG_MERGE has been killed, so kill BLK_MQ_F_SG_MERGE too.

Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Reviewed-by: Omar Sandoval &lt;osandov@fb.com&gt;
Signed-off-by: Ming Lei &lt;ming.lei@redhat.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
QUEUE_FLAG_NO_SG_MERGE has been killed, so kill BLK_MQ_F_SG_MERGE too.

Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Reviewed-by: Omar Sandoval &lt;osandov@fb.com&gt;
Signed-off-by: Ming Lei &lt;ming.lei@redhat.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>block: make request_to_qc_t public</title>
<updated>2018-12-18T16:50:47+00:00</updated>
<author>
<name>Sagi Grimberg</name>
<email>sagi@grimberg.me</email>
</author>
<published>2018-12-14T19:06:06+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=7b7ab780a048699d2b9f416bf2d5c089d8d1028c'/>
<id>7b7ab780a048699d2b9f416bf2d5c089d8d1028c</id>
<content type='text'>
block consumers will need it for polling requests that
are sent with blk_execute_rq_nowait. Also, get rid of
blk_tag_to_qc_t and open-code it instead.

Reviewed-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Sagi Grimberg &lt;sagi@grimberg.me&gt;
Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
block consumers will need it for polling requests that
are sent with blk_execute_rq_nowait. Also, get rid of
blk_tag_to_qc_t and open-code it instead.

Reviewed-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Sagi Grimberg &lt;sagi@grimberg.me&gt;
Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>blk-mq: change blk_mq_queue_busy() to blk_mq_queue_inflight()</title>
<updated>2018-12-18T04:31:42+00:00</updated>
<author>
<name>Jens Axboe</name>
<email>axboe@kernel.dk</email>
</author>
<published>2018-12-18T04:11:17+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=3c94d83cb352627f221d971b05f163c17527de74'/>
<id>3c94d83cb352627f221d971b05f163c17527de74</id>
<content type='text'>
There's a single user of this function, dm, and dm just wants
to check if IO is inflight, not that it's just allocated.

This fixes a hang with srp/002 in blktests with dm, where it tries
to suspend but waits for inflight IO to finish first. As it checks
for just allocated requests, this fails.

Tested-by: Mike Snitzer &lt;snitzer@redhat.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
There's a single user of this function, dm, and dm just wants
to check if IO is inflight, not that it's just allocated.

This fixes a hang with srp/002 in blktests with dm, where it tries
to suspend but waits for inflight IO to finish first. As it checks
for just allocated requests, this fails.

Tested-by: Mike Snitzer &lt;snitzer@redhat.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>block: move queues types to the block layer</title>
<updated>2018-12-04T18:38:17+00:00</updated>
<author>
<name>Christoph Hellwig</name>
<email>hch@lst.de</email>
</author>
<published>2018-12-02T16:46:16+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=e20ba6e1da029136ded295f33076483d65ddf50a'/>
<id>e20ba6e1da029136ded295f33076483d65ddf50a</id>
<content type='text'>
Having another indirect all in the fast path doesn't really help
in our post-spectre world.  Also having too many queue type is just
going to create confusion, so I'd rather manage them centrally.

Note that the queue type naming and ordering changes a bit - the
first index now is the default queue for everything not explicitly
marked, the optional ones are read and poll queues.

Reviewed-by: Sagi Grimberg &lt;sagi@grimberg.me&gt;
Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Having another indirect all in the fast path doesn't really help
in our post-spectre world.  Also having too many queue type is just
going to create confusion, so I'd rather manage them centrally.

Note that the queue type naming and ordering changes a bit - the
first index now is the default queue for everything not explicitly
marked, the optional ones are read and poll queues.

Reviewed-by: Sagi Grimberg &lt;sagi@grimberg.me&gt;
Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>blk-mq: add mq_ops-&gt;commit_rqs()</title>
<updated>2018-11-29T17:11:56+00:00</updated>
<author>
<name>Jens Axboe</name>
<email>axboe@kernel.dk</email>
</author>
<published>2018-11-28T00:02:25+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=d666ba98f849ad44c4405ecc2180390ebe80f4f9'/>
<id>d666ba98f849ad44c4405ecc2180390ebe80f4f9</id>
<content type='text'>
blk-mq passes information to the hardware about any given request being
the last that we will issue in this sequence. The point is that hardware
can defer costly doorbell type writes to the last request. But if we run
into errors issuing a sequence of requests, we may never send the request
with bd-&gt;last == true set. For that case, we need a hook that tells the
hardware that nothing else is coming right now.

For failures returned by the drivers -&gt;queue_rq() hook, the driver is
responsible for flushing pending requests, if it uses bd-&gt;last to
optimize that part. This works like before, no changes there.

Reviewed-by: Omar Sandoval &lt;osandov@fb.com&gt;
Reviewed-by: Ming Lei &lt;ming.lei@redhat.com&gt;
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
blk-mq passes information to the hardware about any given request being
the last that we will issue in this sequence. The point is that hardware
can defer costly doorbell type writes to the last request. But if we run
into errors issuing a sequence of requests, we may never send the request
with bd-&gt;last == true set. For that case, we need a hook that tells the
hardware that nothing else is coming right now.

For failures returned by the drivers -&gt;queue_rq() hook, the driver is
responsible for flushing pending requests, if it uses bd-&gt;last to
optimize that part. This works like before, no changes there.

Reviewed-by: Omar Sandoval &lt;osandov@fb.com&gt;
Reviewed-by: Ming Lei &lt;ming.lei@redhat.com&gt;
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>blk-mq: Simplify request completion state</title>
<updated>2018-11-26T17:34:27+00:00</updated>
<author>
<name>Keith Busch</name>
<email>keith.busch@intel.com</email>
</author>
<published>2018-11-26T16:54:30+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=af78ff7c6e66832afcdf5418f67b11c409f9e7a1'/>
<id>af78ff7c6e66832afcdf5418f67b11c409f9e7a1</id>
<content type='text'>
There are no more users relying on blk-mq request states to prevent
double completions, so replace the relatively expensive cmpxchg operation
with WRITE_ONCE.

Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Keith Busch &lt;keith.busch@intel.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
There are no more users relying on blk-mq request states to prevent
double completions, so replace the relatively expensive cmpxchg operation
with WRITE_ONCE.

Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Keith Busch &lt;keith.busch@intel.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
</feed>
