<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-stable.git/drivers/block/rbd.c, branch v4.6</title>
<subtitle>Linux kernel stable tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/'/>
<entry>
<title>rbd: report unsupported features to syslog</title>
<updated>2016-04-28T08:07:43+00:00</updated>
<author>
<name>Ilya Dryomov</name>
<email>idryomov@gmail.com</email>
</author>
<published>2016-04-13T12:15:50+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=d3767f0faeda5abdf205f947ae912d48dc70fa06'/>
<id>d3767f0faeda5abdf205f947ae912d48dc70fa06</id>
<content type='text'>
... instead of just returning an error.

Signed-off-by: Ilya Dryomov &lt;idryomov@gmail.com&gt;
Reviewed-by: Josh Durgin &lt;jdurgin@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
... instead of just returning an error.

Signed-off-by: Ilya Dryomov &lt;idryomov@gmail.com&gt;
Reviewed-by: Josh Durgin &lt;jdurgin@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>rbd: fix rbd map vs notify races</title>
<updated>2016-04-28T08:07:22+00:00</updated>
<author>
<name>Ilya Dryomov</name>
<email>idryomov@gmail.com</email>
</author>
<published>2016-04-15T14:22:16+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=811c6688774613a78bfa020f64b570b73f6974c8'/>
<id>811c6688774613a78bfa020f64b570b73f6974c8</id>
<content type='text'>
A while ago, commit 9875201e1049 ("rbd: fix use-after free of
rbd_dev-&gt;disk") fixed rbd unmap vs notify race by introducing
an exported wrapper for flushing notifies and sticking it into
do_rbd_remove().

A similar problem exists on the rbd map path, though: the watch is
registered in rbd_dev_image_probe(), while the disk is set up quite
a few steps later, in rbd_dev_device_setup().  Nothing prevents
a notify from coming in and crashing on a NULL rbd_dev-&gt;disk:

    BUG: unable to handle kernel NULL pointer dereference at 0000000000000050
    Call Trace:
     [&lt;ffffffffa0508344&gt;] rbd_watch_cb+0x34/0x180 [rbd]
     [&lt;ffffffffa04bd290&gt;] do_event_work+0x40/0xb0 [libceph]
     [&lt;ffffffff8109d5db&gt;] process_one_work+0x17b/0x470
     [&lt;ffffffff8109e3ab&gt;] worker_thread+0x11b/0x400
     [&lt;ffffffff8109e290&gt;] ? rescuer_thread+0x400/0x400
     [&lt;ffffffff810a5acf&gt;] kthread+0xcf/0xe0
     [&lt;ffffffff810b41b3&gt;] ? finish_task_switch+0x53/0x170
     [&lt;ffffffff810a5a00&gt;] ? kthread_create_on_node+0x140/0x140
     [&lt;ffffffff81645dd8&gt;] ret_from_fork+0x58/0x90
     [&lt;ffffffff810a5a00&gt;] ? kthread_create_on_node+0x140/0x140
    RIP  [&lt;ffffffffa050828a&gt;] rbd_dev_refresh+0xfa/0x180 [rbd]

If an error occurs during rbd map, we have to error out, potentially
tearing down a watch.  Just like on rbd unmap, notifies have to be
flushed, otherwise rbd_watch_cb() may end up trying to read in the
image header after rbd_dev_image_release() has run:

    Assertion failure in rbd_dev_header_info() at line 4722:

     rbd_assert(rbd_image_format_valid(rbd_dev-&gt;image_format));

    Call Trace:
     [&lt;ffffffff81cccee0&gt;] ? rbd_parent_request_create+0x150/0x150
     [&lt;ffffffff81cd4e59&gt;] rbd_dev_refresh+0x59/0x390
     [&lt;ffffffff81cd5229&gt;] rbd_watch_cb+0x69/0x290
     [&lt;ffffffff81fde9bf&gt;] do_event_work+0x10f/0x1c0
     [&lt;ffffffff81107799&gt;] process_one_work+0x689/0x1a80
     [&lt;ffffffff811076f7&gt;] ? process_one_work+0x5e7/0x1a80
     [&lt;ffffffff81132065&gt;] ? finish_task_switch+0x225/0x640
     [&lt;ffffffff81107110&gt;] ? pwq_dec_nr_in_flight+0x2b0/0x2b0
     [&lt;ffffffff81108c69&gt;] worker_thread+0xd9/0x1320
     [&lt;ffffffff81108b90&gt;] ? process_one_work+0x1a80/0x1a80
     [&lt;ffffffff8111b02d&gt;] kthread+0x21d/0x2e0
     [&lt;ffffffff8111ae10&gt;] ? kthread_stop+0x550/0x550
     [&lt;ffffffff82022802&gt;] ret_from_fork+0x22/0x40
     [&lt;ffffffff8111ae10&gt;] ? kthread_stop+0x550/0x550
    RIP  [&lt;ffffffff81ccd8f9&gt;] rbd_dev_header_info+0xa19/0x1e30

To fix this, a) check if RBD_DEV_FLAG_EXISTS is set before calling
revalidate_disk(), b) move ceph_osdc_flush_notifies() call into
rbd_dev_header_unwatch_sync() to cover rbd map error paths and c) turn
header read-in into a critical section.  The latter also happens to
take care of rbd map foo@bar vs rbd snap rm foo@bar race.

Fixes: http://tracker.ceph.com/issues/15490

Signed-off-by: Ilya Dryomov &lt;idryomov@gmail.com&gt;
Reviewed-by: Josh Durgin &lt;jdurgin@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
A while ago, commit 9875201e1049 ("rbd: fix use-after free of
rbd_dev-&gt;disk") fixed rbd unmap vs notify race by introducing
an exported wrapper for flushing notifies and sticking it into
do_rbd_remove().

A similar problem exists on the rbd map path, though: the watch is
registered in rbd_dev_image_probe(), while the disk is set up quite
a few steps later, in rbd_dev_device_setup().  Nothing prevents
a notify from coming in and crashing on a NULL rbd_dev-&gt;disk:

    BUG: unable to handle kernel NULL pointer dereference at 0000000000000050
    Call Trace:
     [&lt;ffffffffa0508344&gt;] rbd_watch_cb+0x34/0x180 [rbd]
     [&lt;ffffffffa04bd290&gt;] do_event_work+0x40/0xb0 [libceph]
     [&lt;ffffffff8109d5db&gt;] process_one_work+0x17b/0x470
     [&lt;ffffffff8109e3ab&gt;] worker_thread+0x11b/0x400
     [&lt;ffffffff8109e290&gt;] ? rescuer_thread+0x400/0x400
     [&lt;ffffffff810a5acf&gt;] kthread+0xcf/0xe0
     [&lt;ffffffff810b41b3&gt;] ? finish_task_switch+0x53/0x170
     [&lt;ffffffff810a5a00&gt;] ? kthread_create_on_node+0x140/0x140
     [&lt;ffffffff81645dd8&gt;] ret_from_fork+0x58/0x90
     [&lt;ffffffff810a5a00&gt;] ? kthread_create_on_node+0x140/0x140
    RIP  [&lt;ffffffffa050828a&gt;] rbd_dev_refresh+0xfa/0x180 [rbd]

If an error occurs during rbd map, we have to error out, potentially
tearing down a watch.  Just like on rbd unmap, notifies have to be
flushed, otherwise rbd_watch_cb() may end up trying to read in the
image header after rbd_dev_image_release() has run:

    Assertion failure in rbd_dev_header_info() at line 4722:

     rbd_assert(rbd_image_format_valid(rbd_dev-&gt;image_format));

    Call Trace:
     [&lt;ffffffff81cccee0&gt;] ? rbd_parent_request_create+0x150/0x150
     [&lt;ffffffff81cd4e59&gt;] rbd_dev_refresh+0x59/0x390
     [&lt;ffffffff81cd5229&gt;] rbd_watch_cb+0x69/0x290
     [&lt;ffffffff81fde9bf&gt;] do_event_work+0x10f/0x1c0
     [&lt;ffffffff81107799&gt;] process_one_work+0x689/0x1a80
     [&lt;ffffffff811076f7&gt;] ? process_one_work+0x5e7/0x1a80
     [&lt;ffffffff81132065&gt;] ? finish_task_switch+0x225/0x640
     [&lt;ffffffff81107110&gt;] ? pwq_dec_nr_in_flight+0x2b0/0x2b0
     [&lt;ffffffff81108c69&gt;] worker_thread+0xd9/0x1320
     [&lt;ffffffff81108b90&gt;] ? process_one_work+0x1a80/0x1a80
     [&lt;ffffffff8111b02d&gt;] kthread+0x21d/0x2e0
     [&lt;ffffffff8111ae10&gt;] ? kthread_stop+0x550/0x550
     [&lt;ffffffff82022802&gt;] ret_from_fork+0x22/0x40
     [&lt;ffffffff8111ae10&gt;] ? kthread_stop+0x550/0x550
    RIP  [&lt;ffffffff81ccd8f9&gt;] rbd_dev_header_info+0xa19/0x1e30

To fix this, a) check if RBD_DEV_FLAG_EXISTS is set before calling
revalidate_disk(), b) move ceph_osdc_flush_notifies() call into
rbd_dev_header_unwatch_sync() to cover rbd map error paths and c) turn
header read-in into a critical section.  The latter also happens to
take care of rbd map foo@bar vs rbd snap rm foo@bar race.

Fixes: http://tracker.ceph.com/issues/15490

Signed-off-by: Ilya Dryomov &lt;idryomov@gmail.com&gt;
Reviewed-by: Josh Durgin &lt;jdurgin@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>rbd: use GFP_NOIO consistently for request allocations</title>
<updated>2016-04-05T20:11:37+00:00</updated>
<author>
<name>David Disseldorp</name>
<email>ddiss@suse.de</email>
</author>
<published>2016-04-05T09:13:39+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=2224d879c7c0f85c14183ef82eb48bd875ceb599'/>
<id>2224d879c7c0f85c14183ef82eb48bd875ceb599</id>
<content type='text'>
As of 5a60e87603c4c533492c515b7f62578189b03c9c, RBD object request
allocations are made via rbd_obj_request_create() with GFP_NOIO.
However, subsequent OSD request allocations in rbd_osd_req_create*()
use GFP_ATOMIC.

With heavy page cache usage (e.g. OSDs running on same host as krbd
client), rbd_osd_req_create() order-1 GFP_ATOMIC allocations have been
observed to fail, where direct reclaim would have allowed GFP_NOIO
allocations to succeed.

Cc: stable@vger.kernel.org # 3.18+
Suggested-by: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Suggested-by: Neil Brown &lt;neilb@suse.com&gt;
Signed-off-by: David Disseldorp &lt;ddiss@suse.de&gt;
Signed-off-by: Ilya Dryomov &lt;idryomov@gmail.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
As of 5a60e87603c4c533492c515b7f62578189b03c9c, RBD object request
allocations are made via rbd_obj_request_create() with GFP_NOIO.
However, subsequent OSD request allocations in rbd_osd_req_create*()
use GFP_ATOMIC.

With heavy page cache usage (e.g. OSDs running on same host as krbd
client), rbd_osd_req_create() order-1 GFP_ATOMIC allocations have been
observed to fail, where direct reclaim would have allowed GFP_NOIO
allocations to succeed.

Cc: stable@vger.kernel.org # 3.18+
Suggested-by: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Suggested-by: Neil Brown &lt;neilb@suse.com&gt;
Signed-off-by: David Disseldorp &lt;ddiss@suse.de&gt;
Signed-off-by: Ilya Dryomov &lt;idryomov@gmail.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>rbd: use KMEM_CACHE macro</title>
<updated>2016-03-25T17:51:56+00:00</updated>
<author>
<name>Geliang Tang</name>
<email>geliangtang@163.com</email>
</author>
<published>2016-03-13T07:17:32+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=03d9440676163e965cb77d03c102b461d8ccb482'/>
<id>03d9440676163e965cb77d03c102b461d8ccb482</id>
<content type='text'>
Use KMEM_CACHE() instead of kmem_cache_create() to simplify the code.

Signed-off-by: Geliang Tang &lt;geliangtang@163.com&gt;
Signed-off-by: Ilya Dryomov &lt;idryomov@gmail.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Use KMEM_CACHE() instead of kmem_cache_create() to simplify the code.

Signed-off-by: Geliang Tang &lt;geliangtang@163.com&gt;
Signed-off-by: Ilya Dryomov &lt;idryomov@gmail.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>libceph: enable large, variable-sized OSD requests</title>
<updated>2016-03-25T17:51:43+00:00</updated>
<author>
<name>Ilya Dryomov</name>
<email>idryomov@gmail.com</email>
</author>
<published>2016-02-09T16:50:15+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=3f1af42ad0fad8a12242233dd0d9fc42f5e83415'/>
<id>3f1af42ad0fad8a12242233dd0d9fc42f5e83415</id>
<content type='text'>
Turn r_ops into a flexible array member to enable large OSD requests
consisting of up to 16 ops.  The use case is scattered writeback in
cephfs and, as far as the kernel client is concerned, 16 is just a
made-up number.

r_ops had size 3 for copyup+hint+write, but copyup is really a special
case - it can only happen once.  ceph_osd_request_cache is therefore
stuffed with num_ops=2 requests, anything bigger than that is allocated
with kmalloc().  req_mempool is backed by ceph_osd_request_cache, which
means either num_ops=1 or num_ops=2 for use_mempool=true - all existing
users (ceph_writepages_start(), ceph_osdc_writepages()) are fine with
that.

Signed-off-by: Ilya Dryomov &lt;idryomov@gmail.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Turn r_ops into a flexible array member to enable large OSD requests
consisting of up to 16 ops.  The use case is scattered writeback in
cephfs and, as far as the kernel client is concerned, 16 is just a
made-up number.

r_ops had size 3 for copyup+hint+write, but copyup is really a special
case - it can only happen once.  ceph_osd_request_cache is therefore
stuffed with num_ops=2 requests, anything bigger than that is allocated
with kmalloc().  req_mempool is backed by ceph_osd_request_cache, which
means either num_ops=1 or num_ops=2 for use_mempool=true - all existing
users (ceph_writepages_start(), ceph_osdc_writepages()) are fine with
that.

Signed-off-by: Ilya Dryomov &lt;idryomov@gmail.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>libceph: move r_reply_op_{len,result} into struct ceph_osd_req_op</title>
<updated>2016-03-25T17:51:42+00:00</updated>
<author>
<name>Yan, Zheng</name>
<email>zyan@redhat.com</email>
</author>
<published>2016-01-07T08:48:57+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=7665d85b7307fa0218881bc2009de067c42dc52e'/>
<id>7665d85b7307fa0218881bc2009de067c42dc52e</id>
<content type='text'>
This avoids defining a large array of r_reply_op_{len,result} in
struct ceph_osd_request.

Signed-off-by: Yan, Zheng &lt;zyan@redhat.com&gt;
Signed-off-by: Ilya Dryomov &lt;idryomov@gmail.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This avoids defining a large array of r_reply_op_{len,result} in
struct ceph_osd_request.

Signed-off-by: Yan, Zheng &lt;zyan@redhat.com&gt;
Signed-off-by: Ilya Dryomov &lt;idryomov@gmail.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>rbd: delete an unnecessary check before rbd_dev_destroy()</title>
<updated>2016-01-21T18:36:07+00:00</updated>
<author>
<name>Markus Elfring</name>
<email>elfring@users.sourceforge.net</email>
</author>
<published>2015-11-23T19:16:45+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=1761b22966e61494f51be76bc3b10e9c1ff809ad'/>
<id>1761b22966e61494f51be76bc3b10e9c1ff809ad</id>
<content type='text'>
The rbd_dev_destroy() function tests whether its argument is NULL
and then returns immediately. Thus the test around the call is not needed.

This issue was detected by using the Coccinelle software.

Signed-off-by: Markus Elfring &lt;elfring@users.sourceforge.net&gt;
Signed-off-by: Ilya Dryomov &lt;idryomov@gmail.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The rbd_dev_destroy() function tests whether its argument is NULL
and then returns immediately. Thus the test around the call is not needed.

This issue was detected by using the Coccinelle software.

Signed-off-by: Markus Elfring &lt;elfring@users.sourceforge.net&gt;
Signed-off-by: Ilya Dryomov &lt;idryomov@gmail.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>rbd: don't put snap_context twice in rbd_queue_workfn()</title>
<updated>2015-12-04T13:29:18+00:00</updated>
<author>
<name>Ilya Dryomov</name>
<email>idryomov@gmail.com</email>
</author>
<published>2015-11-27T18:23:24+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=70b16db86f564977df074072143284aec2cb1162'/>
<id>70b16db86f564977df074072143284aec2cb1162</id>
<content type='text'>
Commit 4e752f0ab0e8 ("rbd: access snapshot context and mapping size
safely") moved ceph_get_snap_context() out of rbd_img_request_create()
and into rbd_queue_workfn(), adding a ceph_put_snap_context() to the
error path in rbd_queue_workfn().  However, rbd_img_request_create()
consumes a ref on snapc, so calling ceph_put_snap_context() after
a successful rbd_img_request_create() leads to an extra put.  Fix it.

Cc: stable@vger.kernel.org # 3.18+
Signed-off-by: Ilya Dryomov &lt;idryomov@gmail.com&gt;
Reviewed-by: Josh Durgin &lt;jdurgin@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Commit 4e752f0ab0e8 ("rbd: access snapshot context and mapping size
safely") moved ceph_get_snap_context() out of rbd_img_request_create()
and into rbd_queue_workfn(), adding a ceph_put_snap_context() to the
error path in rbd_queue_workfn().  However, rbd_img_request_create()
consumes a ref on snapc, so calling ceph_put_snap_context() after
a successful rbd_img_request_create() leads to an extra put.  Fix it.

Cc: stable@vger.kernel.org # 3.18+
Signed-off-by: Ilya Dryomov &lt;idryomov@gmail.com&gt;
Reviewed-by: Josh Durgin &lt;jdurgin@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>rbd: remove duplicate calls to rbd_dev_mapping_clear()</title>
<updated>2015-11-02T22:36:48+00:00</updated>
<author>
<name>Ilya Dryomov</name>
<email>idryomov@gmail.com</email>
</author>
<published>2015-10-22T14:44:15+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=4afb04c0c88e21f37e5ef4776e432907d7b12838'/>
<id>4afb04c0c88e21f37e5ef4776e432907d7b12838</id>
<content type='text'>
Commit d1cf5788450e ("rbd: set mapping info earlier") defined
rbd_dev_mapping_clear(), but, just a few days after, commit
f35a4dee14c3 ("rbd: set the mapping size and features later") moved
rbd_dev_mapping_set() calls and added another rbd_dev_mapping_clear()
call instead of moving the old one.  Around the same time, another
duplicate was introduced in rbd_dev_device_release() - kill both.

Signed-off-by: Ilya Dryomov &lt;idryomov@gmail.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Commit d1cf5788450e ("rbd: set mapping info earlier") defined
rbd_dev_mapping_clear(), but, just a few days after, commit
f35a4dee14c3 ("rbd: set the mapping size and features later") moved
rbd_dev_mapping_set() calls and added another rbd_dev_mapping_clear()
call instead of moving the old one.  Around the same time, another
duplicate was introduced in rbd_dev_device_release() - kill both.

Signed-off-by: Ilya Dryomov &lt;idryomov@gmail.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>rbd: set device_type::release instead of device::release</title>
<updated>2015-11-02T22:36:48+00:00</updated>
<author>
<name>Ilya Dryomov</name>
<email>idryomov@gmail.com</email>
</author>
<published>2015-10-16T18:11:25+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=6cac4695f2042a1d0e17aa48c5705f69907e74c3'/>
<id>6cac4695f2042a1d0e17aa48c5705f69907e74c3</id>
<content type='text'>
No point in providing an empty device_type::release callback and then
setting device::release for each rbd_dev dynamically.

Signed-off-by: Ilya Dryomov &lt;idryomov@gmail.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
No point in providing an empty device_type::release callback and then
setting device::release for each rbd_dev dynamically.

Signed-off-by: Ilya Dryomov &lt;idryomov@gmail.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
