<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-stable.git/block/blk.h, branch linux-3.18.y</title>
<subtitle>Linux kernel stable tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/'/>
<entry>
<title>blk-mq: fix race between timeout and freeing request</title>
<updated>2017-11-08T09:03:48+00:00</updated>
<author>
<name>Ming Lei</name>
<email>ming.lei@canonical.com</email>
</author>
<published>2015-08-09T07:41:51+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=b6885d31d1c6b6f4ccd50535d24dbe5c3d8a7d7b'/>
<id>b6885d31d1c6b6f4ccd50535d24dbe5c3d8a7d7b</id>
<content type='text'>
commit 0048b4837affd153897ed1222283492070027aa9 upstream.

Inside timeout handler, blk_mq_tag_to_rq() is called
to retrieve the request from one tag. This way is obviously
wrong because the request can be freed any time and some
fiedds of the request can't be trusted, then kernel oops
might be triggered[1].

Currently wrt. blk_mq_tag_to_rq(), the only special case is
that the flush request can share same tag with the request
cloned from, and the two requests can't be active at the same
time, so this patch fixes the above issue by updating tags-&gt;rqs[tag]
with the active request(either flush rq or the request cloned
from) of the tag.

Also blk_mq_tag_to_rq() gets much simplified with this patch.

Given blk_mq_tag_to_rq() is mainly for drivers and the caller must
make sure the request can't be freed, so in bt_for_each() this
helper is replaced with tags-&gt;rqs[tag].

[1] kernel oops log
[  439.696220] BUG: unable to handle kernel NULL pointer dereference at 0000000000000158^M
[  439.697162] IP: [&lt;ffffffff812d89ba&gt;] blk_mq_tag_to_rq+0x21/0x6e^M
[  439.700653] PGD 7ef765067 PUD 7ef764067 PMD 0 ^M
[  439.700653] Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC ^M
[  439.700653] Dumping ftrace buffer:^M
[  439.700653]    (ftrace buffer empty)^M
[  439.700653] Modules linked in: nbd ipv6 kvm_intel kvm serio_raw^M
[  439.700653] CPU: 6 PID: 2779 Comm: stress-ng-sigfd Not tainted 4.2.0-rc5-next-20150805+ #265^M
[  439.730500] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011^M
[  439.730500] task: ffff880605308000 ti: ffff88060530c000 task.ti: ffff88060530c000^M
[  439.730500] RIP: 0010:[&lt;ffffffff812d89ba&gt;]  [&lt;ffffffff812d89ba&gt;] blk_mq_tag_to_rq+0x21/0x6e^M
[  439.730500] RSP: 0018:ffff880819203da0  EFLAGS: 00010283^M
[  439.730500] RAX: ffff880811b0e000 RBX: ffff8800bb465f00 RCX: 0000000000000002^M
[  439.730500] RDX: 0000000000000000 RSI: 0000000000000202 RDI: 0000000000000000^M
[  439.730500] RBP: ffff880819203db0 R08: 0000000000000002 R09: 0000000000000000^M
[  439.730500] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000202^M
[  439.730500] R13: ffff880814104800 R14: 0000000000000002 R15: ffff880811a2ea00^M
[  439.730500] FS:  00007f165b3f5740(0000) GS:ffff880819200000(0000) knlGS:0000000000000000^M
[  439.730500] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b^M
[  439.730500] CR2: 0000000000000158 CR3: 00000007ef766000 CR4: 00000000000006e0^M
[  439.730500] Stack:^M
[  439.730500]  0000000000000008 ffff8808114eed90 ffff880819203e00 ffffffff812dc104^M
[  439.755663]  ffff880819203e40 ffffffff812d9f5e 0000020000000000 ffff8808114eed80^M
[  439.755663] Call Trace:^M
[  439.755663]  &lt;IRQ&gt; ^M
[  439.755663]  [&lt;ffffffff812dc104&gt;] bt_for_each+0x6e/0xc8^M
[  439.755663]  [&lt;ffffffff812d9f5e&gt;] ? blk_mq_rq_timed_out+0x6a/0x6a^M
[  439.755663]  [&lt;ffffffff812d9f5e&gt;] ? blk_mq_rq_timed_out+0x6a/0x6a^M
[  439.755663]  [&lt;ffffffff812dc1b3&gt;] blk_mq_tag_busy_iter+0x55/0x5e^M
[  439.755663]  [&lt;ffffffff812d88b4&gt;] ? blk_mq_bio_to_request+0x38/0x38^M
[  439.755663]  [&lt;ffffffff812d8911&gt;] blk_mq_rq_timer+0x5d/0xd4^M
[  439.755663]  [&lt;ffffffff810a3e10&gt;] call_timer_fn+0xf7/0x284^M
[  439.755663]  [&lt;ffffffff810a3d1e&gt;] ? call_timer_fn+0x5/0x284^M
[  439.755663]  [&lt;ffffffff812d88b4&gt;] ? blk_mq_bio_to_request+0x38/0x38^M
[  439.755663]  [&lt;ffffffff810a46d6&gt;] run_timer_softirq+0x1ce/0x1f8^M
[  439.755663]  [&lt;ffffffff8104c367&gt;] __do_softirq+0x181/0x3a4^M
[  439.755663]  [&lt;ffffffff8104c76e&gt;] irq_exit+0x40/0x94^M
[  439.755663]  [&lt;ffffffff81031482&gt;] smp_apic_timer_interrupt+0x33/0x3e^M
[  439.755663]  [&lt;ffffffff815559a4&gt;] apic_timer_interrupt+0x84/0x90^M
[  439.755663]  &lt;EOI&gt; ^M
[  439.755663]  [&lt;ffffffff81554350&gt;] ? _raw_spin_unlock_irq+0x32/0x4a^M
[  439.755663]  [&lt;ffffffff8106a98b&gt;] finish_task_switch+0xe0/0x163^M
[  439.755663]  [&lt;ffffffff8106a94d&gt;] ? finish_task_switch+0xa2/0x163^M
[  439.755663]  [&lt;ffffffff81550066&gt;] __schedule+0x469/0x6cd^M
[  439.755663]  [&lt;ffffffff8155039b&gt;] schedule+0x82/0x9a^M
[  439.789267]  [&lt;ffffffff8119b28b&gt;] signalfd_read+0x186/0x49a^M
[  439.790911]  [&lt;ffffffff8106d86a&gt;] ? wake_up_q+0x47/0x47^M
[  439.790911]  [&lt;ffffffff811618c2&gt;] __vfs_read+0x28/0x9f^M
[  439.790911]  [&lt;ffffffff8117a289&gt;] ? __fget_light+0x4d/0x74^M
[  439.790911]  [&lt;ffffffff811620a7&gt;] vfs_read+0x7a/0xc6^M
[  439.790911]  [&lt;ffffffff8116292b&gt;] SyS_read+0x49/0x7f^M
[  439.790911]  [&lt;ffffffff81554c17&gt;] entry_SYSCALL_64_fastpath+0x12/0x6f^M
[  439.790911] Code: 48 89 e5 e8 a9 b8 e7 ff 5d c3 0f 1f 44 00 00 55 89
f2 48 89 e5 41 54 41 89 f4 53 48 8b 47 60 48 8b 1c d0 48 8b 7b 30 48 8b
53 38 &lt;48&gt; 8b 87 58 01 00 00 48 85 c0 75 09 48 8b 97 88 0c 00 00 eb 10
^M
[  439.790911] RIP  [&lt;ffffffff812d89ba&gt;] blk_mq_tag_to_rq+0x21/0x6e^M
[  439.790911]  RSP &lt;ffff880819203da0&gt;^M
[  439.790911] CR2: 0000000000000158^M
[  439.790911] ---[ end trace d40af58949325661 ]---^M

Signed-off-by: Ming Lei &lt;ming.lei@canonical.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
Signed-off-by: Dmitry Shmidt &lt;dimitrysh@google.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 0048b4837affd153897ed1222283492070027aa9 upstream.

Inside timeout handler, blk_mq_tag_to_rq() is called
to retrieve the request from one tag. This way is obviously
wrong because the request can be freed any time and some
fiedds of the request can't be trusted, then kernel oops
might be triggered[1].

Currently wrt. blk_mq_tag_to_rq(), the only special case is
that the flush request can share same tag with the request
cloned from, and the two requests can't be active at the same
time, so this patch fixes the above issue by updating tags-&gt;rqs[tag]
with the active request(either flush rq or the request cloned
from) of the tag.

Also blk_mq_tag_to_rq() gets much simplified with this patch.

Given blk_mq_tag_to_rq() is mainly for drivers and the caller must
make sure the request can't be freed, so in bt_for_each() this
helper is replaced with tags-&gt;rqs[tag].

[1] kernel oops log
[  439.696220] BUG: unable to handle kernel NULL pointer dereference at 0000000000000158^M
[  439.697162] IP: [&lt;ffffffff812d89ba&gt;] blk_mq_tag_to_rq+0x21/0x6e^M
[  439.700653] PGD 7ef765067 PUD 7ef764067 PMD 0 ^M
[  439.700653] Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC ^M
[  439.700653] Dumping ftrace buffer:^M
[  439.700653]    (ftrace buffer empty)^M
[  439.700653] Modules linked in: nbd ipv6 kvm_intel kvm serio_raw^M
[  439.700653] CPU: 6 PID: 2779 Comm: stress-ng-sigfd Not tainted 4.2.0-rc5-next-20150805+ #265^M
[  439.730500] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011^M
[  439.730500] task: ffff880605308000 ti: ffff88060530c000 task.ti: ffff88060530c000^M
[  439.730500] RIP: 0010:[&lt;ffffffff812d89ba&gt;]  [&lt;ffffffff812d89ba&gt;] blk_mq_tag_to_rq+0x21/0x6e^M
[  439.730500] RSP: 0018:ffff880819203da0  EFLAGS: 00010283^M
[  439.730500] RAX: ffff880811b0e000 RBX: ffff8800bb465f00 RCX: 0000000000000002^M
[  439.730500] RDX: 0000000000000000 RSI: 0000000000000202 RDI: 0000000000000000^M
[  439.730500] RBP: ffff880819203db0 R08: 0000000000000002 R09: 0000000000000000^M
[  439.730500] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000202^M
[  439.730500] R13: ffff880814104800 R14: 0000000000000002 R15: ffff880811a2ea00^M
[  439.730500] FS:  00007f165b3f5740(0000) GS:ffff880819200000(0000) knlGS:0000000000000000^M
[  439.730500] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b^M
[  439.730500] CR2: 0000000000000158 CR3: 00000007ef766000 CR4: 00000000000006e0^M
[  439.730500] Stack:^M
[  439.730500]  0000000000000008 ffff8808114eed90 ffff880819203e00 ffffffff812dc104^M
[  439.755663]  ffff880819203e40 ffffffff812d9f5e 0000020000000000 ffff8808114eed80^M
[  439.755663] Call Trace:^M
[  439.755663]  &lt;IRQ&gt; ^M
[  439.755663]  [&lt;ffffffff812dc104&gt;] bt_for_each+0x6e/0xc8^M
[  439.755663]  [&lt;ffffffff812d9f5e&gt;] ? blk_mq_rq_timed_out+0x6a/0x6a^M
[  439.755663]  [&lt;ffffffff812d9f5e&gt;] ? blk_mq_rq_timed_out+0x6a/0x6a^M
[  439.755663]  [&lt;ffffffff812dc1b3&gt;] blk_mq_tag_busy_iter+0x55/0x5e^M
[  439.755663]  [&lt;ffffffff812d88b4&gt;] ? blk_mq_bio_to_request+0x38/0x38^M
[  439.755663]  [&lt;ffffffff812d8911&gt;] blk_mq_rq_timer+0x5d/0xd4^M
[  439.755663]  [&lt;ffffffff810a3e10&gt;] call_timer_fn+0xf7/0x284^M
[  439.755663]  [&lt;ffffffff810a3d1e&gt;] ? call_timer_fn+0x5/0x284^M
[  439.755663]  [&lt;ffffffff812d88b4&gt;] ? blk_mq_bio_to_request+0x38/0x38^M
[  439.755663]  [&lt;ffffffff810a46d6&gt;] run_timer_softirq+0x1ce/0x1f8^M
[  439.755663]  [&lt;ffffffff8104c367&gt;] __do_softirq+0x181/0x3a4^M
[  439.755663]  [&lt;ffffffff8104c76e&gt;] irq_exit+0x40/0x94^M
[  439.755663]  [&lt;ffffffff81031482&gt;] smp_apic_timer_interrupt+0x33/0x3e^M
[  439.755663]  [&lt;ffffffff815559a4&gt;] apic_timer_interrupt+0x84/0x90^M
[  439.755663]  &lt;EOI&gt; ^M
[  439.755663]  [&lt;ffffffff81554350&gt;] ? _raw_spin_unlock_irq+0x32/0x4a^M
[  439.755663]  [&lt;ffffffff8106a98b&gt;] finish_task_switch+0xe0/0x163^M
[  439.755663]  [&lt;ffffffff8106a94d&gt;] ? finish_task_switch+0xa2/0x163^M
[  439.755663]  [&lt;ffffffff81550066&gt;] __schedule+0x469/0x6cd^M
[  439.755663]  [&lt;ffffffff8155039b&gt;] schedule+0x82/0x9a^M
[  439.789267]  [&lt;ffffffff8119b28b&gt;] signalfd_read+0x186/0x49a^M
[  439.790911]  [&lt;ffffffff8106d86a&gt;] ? wake_up_q+0x47/0x47^M
[  439.790911]  [&lt;ffffffff811618c2&gt;] __vfs_read+0x28/0x9f^M
[  439.790911]  [&lt;ffffffff8117a289&gt;] ? __fget_light+0x4d/0x74^M
[  439.790911]  [&lt;ffffffff811620a7&gt;] vfs_read+0x7a/0xc6^M
[  439.790911]  [&lt;ffffffff8116292b&gt;] SyS_read+0x49/0x7f^M
[  439.790911]  [&lt;ffffffff81554c17&gt;] entry_SYSCALL_64_fastpath+0x12/0x6f^M
[  439.790911] Code: 48 89 e5 e8 a9 b8 e7 ff 5d c3 0f 1f 44 00 00 55 89
f2 48 89 e5 41 54 41 89 f4 53 48 8b 47 60 48 8b 1c d0 48 8b 7b 30 48 8b
53 38 &lt;48&gt; 8b 87 58 01 00 00 48 85 c0 75 09 48 8b 97 88 0c 00 00 eb 10
^M
[  439.790911] RIP  [&lt;ffffffff812d89ba&gt;] blk_mq_tag_to_rq+0x21/0x6e^M
[  439.790911]  RSP &lt;ffff880819203da0&gt;^M
[  439.790911] CR2: 0000000000000158^M
[  439.790911] ---[ end trace d40af58949325661 ]---^M

Signed-off-by: Ming Lei &lt;ming.lei@canonical.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
Signed-off-by: Dmitry Shmidt &lt;dimitrysh@google.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>blk-mq: support per-distpatch_queue flush machinery</title>
<updated>2014-09-25T21:22:45+00:00</updated>
<author>
<name>Ming Lei</name>
<email>ming.lei@canonical.com</email>
</author>
<published>2014-09-25T15:23:47+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=f70ced09170761acb69840cafaace4abc72cba4b'/>
<id>f70ced09170761acb69840cafaace4abc72cba4b</id>
<content type='text'>
This patch supports to run one single flush machinery for
each blk-mq dispatch queue, so that:

- current init_request and exit_request callbacks can
cover flush request too, then the buggy copying way of
initializing flush request's pdu can be fixed

- flushing performance gets improved in case of multi hw-queue

In fio sync write test over virtio-blk(4 hw queues, ioengine=sync,
iodepth=64, numjobs=4, bs=4K), it is observed that througput gets
increased a lot over my test environment:
	- throughput: +70% in case of virtio-blk over null_blk
	- throughput: +30% in case of virtio-blk over SSD image

The multi virtqueue feature isn't merged to QEMU yet, and patches for
the feature can be found in below tree:

	git://kernel.ubuntu.com/ming/qemu.git  	v2.1.0-mq.4

And simply passing 'num_queues=4 vectors=5' should be enough to
enable multi queue(quad queue) feature for QEMU virtio-blk.

Suggested-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Ming Lei &lt;ming.lei@canonical.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This patch supports to run one single flush machinery for
each blk-mq dispatch queue, so that:

- current init_request and exit_request callbacks can
cover flush request too, then the buggy copying way of
initializing flush request's pdu can be fixed

- flushing performance gets improved in case of multi hw-queue

In fio sync write test over virtio-blk(4 hw queues, ioengine=sync,
iodepth=64, numjobs=4, bs=4K), it is observed that througput gets
increased a lot over my test environment:
	- throughput: +70% in case of virtio-blk over null_blk
	- throughput: +30% in case of virtio-blk over SSD image

The multi virtqueue feature isn't merged to QEMU yet, and patches for
the feature can be found in below tree:

	git://kernel.ubuntu.com/ming/qemu.git  	v2.1.0-mq.4

And simply passing 'num_queues=4 vectors=5' should be enough to
enable multi queue(quad queue) feature for QEMU virtio-blk.

Suggested-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Ming Lei &lt;ming.lei@canonical.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>block: introduce 'blk_mq_ctx' parameter to blk_get_flush_queue</title>
<updated>2014-09-25T21:22:44+00:00</updated>
<author>
<name>Ming Lei</name>
<email>ming.lei@canonical.com</email>
</author>
<published>2014-09-25T15:23:46+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=e97c293cdf77263abdc021de280516e0017afc84'/>
<id>e97c293cdf77263abdc021de280516e0017afc84</id>
<content type='text'>
This patch adds 'blk_mq_ctx' parameter to blk_get_flush_queue(),
so that this function can find the corresponding blk_flush_queue
bound with current mq context since the flush queue will become
per hw-queue.

For legacy queue, the parameter can be simply 'NULL'.

For multiqueue case, the parameter should be set as the context
from which the related request is originated. With this context
info, the hw queue and related flush queue can be found easily.

Signed-off-by: Ming Lei &lt;ming.lei@canonical.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This patch adds 'blk_mq_ctx' parameter to blk_get_flush_queue(),
so that this function can find the corresponding blk_flush_queue
bound with current mq context since the flush queue will become
per hw-queue.

For legacy queue, the parameter can be simply 'NULL'.

For multiqueue case, the parameter should be set as the context
from which the related request is originated. With this context
info, the hw queue and related flush queue can be found easily.

Signed-off-by: Ming Lei &lt;ming.lei@canonical.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>block: remove blk_init_flush() and its pair</title>
<updated>2014-09-25T21:22:41+00:00</updated>
<author>
<name>Ming Lei</name>
<email>ming.lei@canonical.com</email>
</author>
<published>2014-09-25T15:23:44+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=ba483388e3058b3e412632a84e6bf1f134beaf3d'/>
<id>ba483388e3058b3e412632a84e6bf1f134beaf3d</id>
<content type='text'>
Now mission of the two helpers is over, and just call
blk_alloc_flush_queue() and blk_free_flush_queue() directly.

Signed-off-by: Ming Lei &lt;ming.lei@canonical.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Now mission of the two helpers is over, and just call
blk_alloc_flush_queue() and blk_free_flush_queue() directly.

Signed-off-by: Ming Lei &lt;ming.lei@canonical.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>block: introduce blk_flush_queue to drive flush machinery</title>
<updated>2014-09-25T21:22:40+00:00</updated>
<author>
<name>Ming Lei</name>
<email>ming.lei@canonical.com</email>
</author>
<published>2014-09-25T15:23:43+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=7c94e1c157a227837b04f02f5edeff8301410ba2'/>
<id>7c94e1c157a227837b04f02f5edeff8301410ba2</id>
<content type='text'>
This patch introduces 'struct blk_flush_queue' and puts all
flush machinery related fields into this structure, so that

	- flush implementation details aren't exposed to driver
	- it is easy to convert to per dispatch-queue flush machinery

This patch is basically a mechanical replacement.

Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Ming Lei &lt;ming.lei@canonical.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This patch introduces 'struct blk_flush_queue' and puts all
flush machinery related fields into this structure, so that

	- flush implementation details aren't exposed to driver
	- it is easy to convert to per dispatch-queue flush machinery

This patch is basically a mechanical replacement.

Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Ming Lei &lt;ming.lei@canonical.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>block: introduce blk_init_flush and its pair</title>
<updated>2014-09-25T21:22:35+00:00</updated>
<author>
<name>Ming Lei</name>
<email>ming.lei@canonical.com</email>
</author>
<published>2014-09-25T15:23:40+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=f355265571440a7db16e784b6edf4e7d26971a03'/>
<id>f355265571440a7db16e784b6edf4e7d26971a03</id>
<content type='text'>
These two temporary functions are introduced for holding flush
initialization and de-initialization, so that we can
introduce 'flush queue' easier in the following patch. And
once 'flush queue' and its allocation/free functions are ready,
they will be removed for sake of code readability.

Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Ming Lei &lt;ming.lei@canonical.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
These two temporary functions are introduced for holding flush
initialization and de-initialization, so that we can
introduce 'flush queue' easier in the following patch. And
once 'flush queue' and its allocation/free functions are ready,
they will be removed for sake of code readability.

Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Ming Lei &lt;ming.lei@canonical.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>blk-mq: unshared timeout handler</title>
<updated>2014-09-22T18:00:07+00:00</updated>
<author>
<name>Christoph Hellwig</name>
<email>hch@lst.de</email>
</author>
<published>2014-09-13T23:40:12+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=46f92d42ee37e10970e33891b7b61a342bd97aeb'/>
<id>46f92d42ee37e10970e33891b7b61a342bd97aeb</id>
<content type='text'>
Duplicate the (small) timeout handler in blk-mq so that we can pass
arguments more easily to the driver timeout handler.  This enables
the next patch.

Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Duplicate the (small) timeout handler in blk-mq so that we can pass
arguments more easily to the driver timeout handler.  This enables
the next patch.

Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>block: remove elv_abort_queue and blk_abort_flushes</title>
<updated>2014-06-11T21:31:21+00:00</updated>
<author>
<name>Christoph Hellwig</name>
<email>hch@lst.de</email>
</author>
<published>2014-06-11T11:49:23+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=2940474af79744411da0cb63b041ad52c57bc443'/>
<id>2940474af79744411da0cb63b041ad52c57bc443</id>
<content type='text'>
elv_abort_queue has no callers, and blk_abort_flushes is only called by
elv_abort_queue.

Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
elv_abort_queue has no callers, and blk_abort_flushes is only called by
elv_abort_queue.

Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>blk-mq: allow changing of queue depth through sysfs</title>
<updated>2014-05-20T17:49:02+00:00</updated>
<author>
<name>Jens Axboe</name>
<email>axboe@fb.com</email>
</author>
<published>2014-05-20T17:49:02+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=e3a2b3f931f59d5284abd13faf8bded726884ffd'/>
<id>e3a2b3f931f59d5284abd13faf8bded726884ffd</id>
<content type='text'>
For request_fn based devices, the block layer exports a 'nr_requests'
file through sysfs to allow adjusting of queue depth on the fly.
Currently this returns -EINVAL for blk-mq, since it's not wired up.
Wire this up for blk-mq, so that it now also always dynamic
adjustments of the allowed queue depth for any given block device
managed by blk-mq.

Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
For request_fn based devices, the block layer exports a 'nr_requests'
file through sysfs to allow adjusting of queue depth on the fly.
Currently this returns -EINVAL for blk-mq, since it's not wired up.
Wire this up for blk-mq, so that it now also always dynamic
adjustments of the allowed queue depth for any given block device
managed by blk-mq.

Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>blk-mq: improve support for shared tags maps</title>
<updated>2014-05-13T21:10:52+00:00</updated>
<author>
<name>Jens Axboe</name>
<email>axboe@fb.com</email>
</author>
<published>2014-05-13T21:10:52+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=0d2602ca30e410e84e8bdf05c84ed5688e0a5a44'/>
<id>0d2602ca30e410e84e8bdf05c84ed5688e0a5a44</id>
<content type='text'>
This adds support for active queue tracking, meaning that the
blk-mq tagging maintains a count of active users of a tag set.
This allows us to maintain a notion of fairness between users,
so that we can distribute the tag depth evenly without starving
some users while allowing others to try unfair deep queues.

If sharing of a tag set is detected, each hardware queue will
track the depth of its own queue. And if this exceeds the total
depth divided by the number of active queues, the user is actively
throttled down.

The active queue count is done lazily to avoid bouncing that data
between submitter and completer. Each hardware queue gets marked
active when it allocates its first tag, and gets marked inactive
when 1) the last tag is cleared, and 2) the queue timeout grace
period has passed.

Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This adds support for active queue tracking, meaning that the
blk-mq tagging maintains a count of active users of a tag set.
This allows us to maintain a notion of fairness between users,
so that we can distribute the tag depth evenly without starving
some users while allowing others to try unfair deep queues.

If sharing of a tag set is detected, each hardware queue will
track the depth of its own queue. And if this exceeds the total
depth divided by the number of active queues, the user is actively
throttled down.

The active queue count is done lazily to avoid bouncing that data
between submitter and completer. Each hardware queue gets marked
active when it allocates its first tag, and gets marked inactive
when 1) the last tag is cleared, and 2) the queue timeout grace
period has passed.

Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
